Abstract

The primary tool for predicting infectious disease spread and intervention effectiveness is the mass 
action Susceptible-Infected-Recovered model of Kermack and McKendrick [24]. Its usefulness derives
largely from its conceptual and mathematical simplicity; however, it incorrectly assumes all individuals 
have the same contact rate and contacts are fleeting. This paper is the first of three investigating 
edge-based compartmental modeling, a technique eliminating these assumptions. In this paper, we derive 
simple ordinary differential equation models capturing social heterogeneity (heterogeneous contact rates) 
while explicitly considering the impact of contact duration. We introduce a graphical interpretation 
allowing for easy derivation and communication of the model. This paper focuses on the technique and 
how to apply it in different contexts. The companion papers investigate choosing the appropriate level 
of complexity for a model and how to apply edge-based compartmental modeling to populations with 
various sub-structures. 

1 Introduction 

The conceptual and mathematical simplicity of Kermack and McKendrick's [24] Mass Action Susceptible-
Infected-Recovered (SIR) model has made it the most popular quantitative tool to study infectious disease
spread for over 80 years. However, it ignores important details of the fabric of social interactions, assuming
homogeneous contact rates and negligible contact duration. Improvements are largely ad hoc, spanning the
range between mild modifications of the model and elaborate agent-based simulations 1 , 28 , 25 , 42 [T3 H7] . 
Increased complexity allows us to incorporate more realistic effects, but at a price. It becomes difficult to 
identify which variables drive disease spread or to address sensitivity to changing the underlying assumptions. 
In this paper we show that shifting our attention to the status of an average contact rather than an average 
individual yields a surprisingly simple mathematical description, expanding the universe of analytically 
tractable models. This allows epidemiologists to consider more realistic social interactions and test sensitivity 
to assumptions, improving the robustness of public health recommendations. 

We motivate our approach using the standard Mass Action (MA) SIR model. We are interested in the 
susceptible S(t), infected I(t), and recovered R(t) proportions of the population as time t changes. Under 
mass action assumptions, an infected individual causes new infections at rate $\beta S(t)$, where $\beta$ is the
per-infected transmission rate and $S$ is the probability the recipient is susceptible. Recovery to an immune
state happens at rate $\gamma$. The flux of individuals from susceptible to infected to recovered is represented
by a flow diagram (figure 1), making the model conceptually simple. This leads to a simple mathematical
interpretation, the low-dimensional, ordinary differential equation (ODE) system

$$\dot{S} = -\beta I S, \qquad \dot{I} = \beta I S - \gamma I, \qquad \dot{R} = \gamma I.$$
Figure 1: Mass action flow diagram. The flux of individuals from Susceptible to Infected to Recovered for the standard
MA model. Each compartment accumulates and loses probability at the rates given on the arrows. 

The dot denotes differentiation in time. An ODE system allows for easy prediction of details such as early 
growth rates, final size, and intermediate dynamics. Using S + I + R = 1 we reexpress this as 

$$\dot{S} = -\beta I S, \qquad I = 1 - S - R, \qquad \dot{R} = \gamma I. \tag{1}$$

The product IS measures the proportion of contacts that are from an infected individual to a susceptible 
individual. 
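As a point of reference for the models that follow, system (1) can be integrated numerically in a few lines. The sketch below is illustrative only: the fixed-step Runge-Kutta integrator, the initial infected fraction `I0`, and the parameter values are assumptions chosen for the example, not part of the model.

```python
def ma_sir_rhs(state, beta, gamma):
    """Right-hand side of system (1) for the state (S, R); I follows from S + I + R = 1."""
    S, R = state
    I = 1.0 - S - R
    return (-beta * I * S, gamma * I)

def rk4_step(f, state, dt, *args):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state, *args)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def solve_ma_sir(beta, gamma, I0=1e-4, t_max=60.0, dt=0.01):
    """Return a list of (t, S, I, R); I0 is an assumed initial infected fraction."""
    state = (1.0 - I0, 0.0)
    out = [(0.0, state[0], I0, 0.0)]
    for i in range(1, int(round(t_max / dt)) + 1):
        state = rk4_step(ma_sir_rhs, state, dt, beta, gamma)
        S, R = state
        out.append((i * dt, S, 1.0 - S - R, R))
    return out

# Illustrative parameter values (assumed for this example only).
trajectory = solve_ma_sir(beta=3.0, gamma=1.0)
t_end, S_end, I_end, R_end = trajectory[-1]
print(f"final size R = {R_end:.4f}")
```

Because $dS/dR = -(\beta/\gamma) S$ along any solution, $S(t) = S(0) e^{-\beta R(t)/\gamma}$ gives a convenient check on the integration.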

The MA model often provides a reasonable description of epidemics; however it has well-recognized flaws 
which cause the frequency of infected to susceptible contacts to vary from IS. We highlight two. It neglects 
both social heterogeneity, variation in contact rates which can be quite broad |26[ 141 j . and contact duration, 
implicitly assuming all contacts are infinitesimally short. Because of these omissions, model predictions can
differ from reality. For example, due to social heterogeneity, early infections tend to have more contacts [5] 
and when infected may cause more infections than "average" individuals, enhancing the early spread over 
that predicted by the MA model [21 221 [321 E3 [31]. When contact duration is significant, an infected 
individual may have already infected its neighbors, reducing its ability to cause new infections. Because of 
these assumptions, the MA model predicts the same results for a sexually transmitted disease in a completely 
monogamous population, a population with serial monogamy, and a population with wide variation in contact 
levels with mean one. Intuitively we expect these to produce dramatically different epidemics, but no existing 
mathematical theory allows analytic comparisons. 

Over the past 25 years, attempts have been made to eliminate these assumptions without sacrificing 
analytical tractability. With few exceptions (notably [51]) these make an "all-or-nothing" assumption about 
contact duration: contacts are fleeting and never repeated or they never change. With fleeting contacts, 
social heterogeneity is introduced by adding multiple risk groups to the MA model: in extreme cases there 
are arbitrarily many subgroups (the Mean Field Social Heterogeneity model) (1, 29, 30, 4Q1I3S]. This model 
is relatively well understood and can be rigorously reduced to a handful of equations. With permanent 
contacts, social heterogeneity is introduced through static networks [10l El 021 [231 [32]. Static network
results typically give the final size but no dynamic information. Some attempts to predict dynamics with
static networks use Pair Approximation techniques [14], relying on approximation of network structures. More
rigorous approaches avoid these approximations, but are more difficult [53] [23 13 HI]. Of these, only [53]
yields a closed ODE system (see also [35], [20]). This model lacks an illustration like figure 1, hampering
communication and further development. Finite, nonzero contact duration is typically handled through
simulation, which is usually too slow to study parameter space. 

Although no coherent mathematical structure to study social heterogeneity and contact duration exists, 
there have been studies collecting this data in real-world contexts [HJ [551 [SOI IE]. Typically the resulting
measurements have been reduced to average contact rates to make the mathematics tractable. Much of 
the available and potentially relevant detail is discarded because existing models cannot capture the detail 
collected. 

We find that the appropriate perspective allows us to develop conceptually and mathematically simple 
models that incorporate social heterogeneity and (arbitrary) contact duration. This provides a unifying 
framework for existing models and allows an expanded universe of models. Our goal in each case is to 
calculate the susceptible, infected, or recovered proportions of the population, but we find that this can 
be answered more easily using an equivalent problem. We ask the question, "what is the probability that 
a randomly chosen test node u is susceptible, infected, or recovered?" Because u is chosen randomly, the 
probability it is susceptible equals the proportion susceptible S(t), and similarly for I and R. If we know 
$S(t)$, then the initial conditions and $\dot{R} = \gamma I$, $I = 1 - S - R$ determine $I(t)$ and $R(t)$ as in (1).
Model | Population Structure | Section
Configuration Model (CM) | Static network with specified degree distribution, assigned using the probability mass function $P(k)$. | 2
Dynamic Fixed-Degree (DFD) | Dynamic network for which each node's degree remains a constant value, assigned using $P(k)$. | 3.2
Dormant Contact (DC) | Dynamic fixed-degree network incorporating gaps between partnerships; a node may wait before replacing a partner. | 3.3
Mixed Poisson model (MP) | Static network with specified distribution of expected degrees $\kappa$ assigned using the probability density function $\rho(\kappa)$. | 4.1
Dynamic Variable-Degree (DVD) | Dynamic network with degrees varying in time with averages assigned using $\rho(\kappa)$. | 4.3
Mean Field Social Heterogeneity (MFSH) | Population with a distribution of contact rates assigned using $P(k)$ or $\rho(\kappa)$ and negligible contact duration. | 3.1

Table 1: Populations to which we apply edge-based compartmental models.



The probability u is susceptible is the probability no neighbor has ever transmitted infection to u. The 
method to calculate this is the focus of this paper. This probability depends on how many neighbors u 
has, the rate its neighbors change, and the probability that a random neighbor is infected at any given time. 
Because a random neighbor is likely to have more contacts than a random node, knowing the infected fraction 
of the population does not give the probability a neighbor of u is infected. We focus on the probability a 
random neighbor is infected rather than the probability a random individual is infected. Once we calculate 
this, it is straightforward to calculate the probability u is susceptible. The resulting edge-based compartmental 
modeling approach significantly increases the effects we can study compared to MA models with only a small 
complexity penalty. 

In this paper we consider the spread of epidemics in two general classes of networks, actual degree networks 
(based on Configuration Model networks [39, 45, 19]) and expected degree networks (based on Mixed Poisson 
[commonly called Chung-Lu] networks [91 1471 [5]). In both cases we can consider static and dynamic networks.
In actual degree networks, a node is assigned $k$ stubs where $k$ is a random non-negative integer assigned
independently for each node from some probability distribution. Edges are created by pairing stubs from
different nodes. In expected degree networks, a node is assigned $\kappa$ where $\kappa$ is a random non-negative real
number. Edges are assigned between two nodes $u$ and $v$ with probability proportional to $\kappa_u \kappa_v$. We develop
exact differential equations for the large population limit, which we compare with simulations. Detailed 
descriptions of the simulation techniques are in the Appendix. 
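The stub-pairing construction for actual degree networks can be sketched as follows. This is a minimal illustration, not the simulation code described in the Appendix: the function name is hypothetical, and for simplicity the sketch keeps self-loops and repeated edges, which become rare in large networks.

```python
import random

def configuration_model_edges(degrees, seed=None):
    """Pair stubs uniformly at random and return a list of (u, v) edges.

    degrees[u] is the number of stubs of node u. Self-loops and repeated
    edges are NOT removed here (a simplifying assumption of this sketch).
    """
    rng = random.Random(seed)
    # One list entry per stub, labelled by the node that owns it.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("total degree must be even")
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))

# Example: half the nodes have degree 2 and half have degree 8.
degrees = [2, 8] * 50
edges = configuration_model_edges(degrees, seed=1)
print(len(edges), "edges")
```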

We summarize the populations we consider in Table 1. We begin by analyzing the simplest edge-based
compartmental model in detail, exploring epidemic spread in a static network of known degree distribution, 
a Configuration Model network. To derive the equations, we introduce a flow diagram that leads to a simple 
mathematical formulation. We next consider disease spreading through dynamic actual degree networks and 
then static and dynamic expected degree networks. The template shown here allows us to derive a handful 
of ODEs for each of these populations. Unsurprisingly, the stronger our assumptions, the simpler our formu- 
lation becomes. We neglect heterogeneity within the population other than the contact levels, assume the 
disease has a very simple structure, and assume the population is at equilibrium prior to disease introduction. 
The companion papers investigate conditions under which the simpler models are appropriate [38] and how
to apply the technique to more complex population and disease structures [37].

2 Configuration Model Epidemics 

We demonstrate our approach with Configuration Model (CM) networks. A CM network is static with a 
known degree distribution (the distribution of the number of contacts). We create a CM network with N 
nodes as follows: We assign each node $u$ its degree $k_u$ with probability $P(k_u)$ and give it $k_u$ stubs
(half-edges). Once all nodes are assigned stubs, we pair stubs randomly into edges. The probability a randomly
selected node $u$ has degree $k$ is $P(k)$. In contrast, the probability a stub of $u$ connects to some stub of $v$ is
proportional to $k_v$. So the probability a randomly selected neighbor of $u$ has degree $k$ is

$$P_n(k) = \frac{k P(k)}{\langle K \rangle}.$$

See [ini [31] for more detail.

Figure 2: Edge-based compartmental modeling for Configuration Model networks. The flow diagram for a static
CM network. (Left) The $\phi_S$, $\phi_I$, and $\phi_R$ compartments represent the probability that a neighbor is susceptible, infected, or
recovered and has not transmitted infection. The $1 - \theta$ compartment is the probability it has transmitted. The fluxes between
the $\phi$ compartments result from infection or recovery of a neighbor of the test node and the $\phi_I$ to $1 - \theta$ flux results from a
neighbor transmitting infection to the test node. (Right) $S$, $I$, and $R$ represent the proportion of the population susceptible,
infected, and recovered. We can find $S$ explicitly, and $I$ and $R$ follow as in the MA model.
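The size-biased neighbor degree distribution $P_n(k)$ is easy to compute for a finite degree distribution. The sketch below assumes the distribution is stored as a dictionary mapping degree to probability; the function name is hypothetical.

```python
def neighbor_degree_distribution(P):
    """Given P: degree k -> P(k), return P_n(k) = k P(k) / <K>."""
    mean_k = sum(k * p for k, p in P.items())
    return {k: k * p / mean_k for k, p in P.items()}

# Example: half the nodes have degree 2 and half have degree 8, so <K> = 5.
P = {2: 0.5, 8: 0.5}
P_n = neighbor_degree_distribution(P)
mean_neighbor_degree = sum(k * p for k, p in P_n.items())
print(P_n, mean_neighbor_degree)
```

In this example a random neighbor has degree 8 with probability 0.8, so the mean degree of a neighbor (6.8) exceeds the mean degree of a node (5).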

We assume the disease transmits from an infected node to a neighbor at rate $\beta$. If the neighbor is
susceptible, it becomes infected. Infected nodes recover at rate $\gamma$. Throughout, we assume a large population,
a small initial proportion infected, a small initial proportion of stubs belonging to infected nodes, and a growing
outbreak. Our equations become correct once the number of infections $NI$ has grown large enough to behave
deterministically, while the proportion infected $I$ is still small. While stochastic effects are important, other
methods such as branching process approximations [12] (which apply in large populations with small numbers
infected) may be more useful.

To calculate S(t), I(t), and R(t) we note that these are the probabilities a random test node u is in each 
state. We calculate $S(t)$ by noting it is also the probability none of its neighbors has yet transmitted to $u$.
We would like to treat each neighbor as independent, but the probability one neighbor v has become infected 
is affected by whether another neighbor w of u has transmitted to u since u could infect v. Accounting for 
this directly requires considerable bookkeeping. A simpler approach removes the correlation by assuming 
u causes no infections. This does not alter the state of u: the probabilities we calculate for u will be the 
proportion of the population in each state under the original assumption u behaves as any other node, and 
so this yields an equivalent problem. Further discussion of this modification is in the Appendix. 

We define $\theta(t)$ to be the probability a randomly chosen neighbor has not transmitted to $u$. Initially $\theta$ is
close to 1. For large CM networks, neighbors of $u$ are independent. So given its degree $k$, $u$ is susceptible at
time $t$ with probability $s(k, \theta(t)) = \theta(t)^k$. Thus $S(t) = \sum_k P(k) s(k, \theta(t)) = \psi(\theta(t))$ where

$$\psi(x) = \sum_k P(k) x^k$$

is the probability generating function [5S] of the degree distribution [the properties of $\psi$ we use are that
its derivative is $\sum_k k P(k) x^{k-1}$, its second derivative is $\sum_k k(k-1) P(k) x^{k-2}$, and $\psi'(1) = \langle K \rangle$]. For many
important probability distributions, $\psi$ takes a simple form, which simplifies our examples. Combining with
the flow diagram for $S$, $I$, and $R$ in figure 2 we have

$$\dot{R} = \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - S - R.$$
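For a finite degree distribution, $\psi$ and the two derivatives used in this paper can be evaluated directly from their definitions. The helper below is a hypothetical sketch that assumes the distribution is given as a dictionary of degree to probability.

```python
def pgf(P):
    """Return (psi, psi', psi'') as functions for a finite distribution P: k -> P(k)."""
    def psi(x):
        return sum(p * x**k for k, p in P.items())
    def dpsi(x):
        return sum(k * p * x**(k - 1) for k, p in P.items() if k >= 1)
    def d2psi(x):
        return sum(k * (k - 1) * p * x**(k - 2) for k, p in P.items() if k >= 2)
    return psi, dpsi, d2psi

# Example: half the nodes have degree 2 and half have degree 8.
psi, dpsi, d2psi = pgf({2: 0.5, 8: 0.5})
print(psi(1.0), dpsi(1.0), d2psi(1.0))   # normalization, <K>, <K^2 - K>
print(psi(0.9))                          # S when theta = 0.9
```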

To calculate the new variable $\theta$, we break it into three parts: the probability a neighbor $v$ is susceptible
at time $t$, $\phi_S$; the probability $v$ is infected at time $t$ but has not transmitted infection to $u$, $\phi_I$; and the
probability $v$ has recovered by time $t$ but did not transmit infection to $u$, $\phi_R$. Then $\theta = \phi_S + \phi_I + \phi_R$.
Initially $\phi_S$ and $\theta$ are approximately 1 and $\phi_I$, $\phi_R$ are small (they sum to $\theta - \phi_S$). The flow diagram for $\phi_S$,
$\phi_I$, $\phi_R$, and $1 - \theta$ (figure 2) shows the probability fluxes between these compartments. The rate an infected
neighbor transmits to $u$ is $\beta$, so the $\phi_I$ to $1 - \theta$ flux is $\beta \phi_I$. We conclude $\dot{\theta} = -\beta \phi_I$. To find $\phi_I$ we will use
$\phi_I = \theta - \phi_S - \phi_R$ and calculate $\phi_S$ and $\phi_R$ explicitly.

The rate an infected neighbor recovers is $\gamma$. Thus the $\phi_I$ to $\phi_R$ flux is $\gamma \phi_I$. This is proportional to the
flux into $1 - \theta$ with the constant of proportionality $\gamma / \beta$. That is, $\dot{\phi}_R = \gamma \phi_I$ and $\frac{d}{dt}(1 - \theta) = \beta \phi_I$. Since $\phi_R$
and $1 - \theta$ both begin as approximately 0, we have $\phi_R = \gamma (1 - \theta) / \beta$ in the large population limit. To find
$\phi_S$, recall a neighbor $v$ has degree $k$ with probability $P_n(k) = k P(k) / \langle K \rangle$. Given $k$, $v$ is susceptible with
probability $\theta^{k-1}$ (we disallow transmission from $u$ so $k - 1$ nodes can infect $v$). A weighted average gives
$\phi_S = \sum_k P_n(k) \theta^{k-1} = \sum_k k P(k) \theta^{k-1} / \langle K \rangle = \psi'(\theta) / \psi'(1)$. Thus
$\phi_I = \theta - \phi_S - \phi_R = \theta - \psi'(\theta)/\psi'(1) - \gamma (1 - \theta)/\beta$ and $\dot{\theta} = -\beta \phi_I$ becomes

$$\dot{\theta} = -\beta \theta + \beta \frac{\psi'(\theta)}{\psi'(1)} + \gamma (1 - \theta),$$

yielding

$$\dot{R} = \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - S - R. \tag{3}$$

This captures substantially more population structure than the MA model with only marginally more complexity.
This is the system of [35] and is equivalent to that of [53]. It improves on approaches of [SJ [27]
which require either $O(M)$ or $O(M^2)$ ODEs where $M$ is the (possibly unbounded) maximum degree. This
derivation is simpler than [53] because we choose variables with a conservation property, simplifying the
bookkeeping.
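The CM system is small enough to integrate with a few lines of code. The following sketch is one possible implementation under stated assumptions: a fixed-step Runge-Kutta scheme, initial conditions $\theta(0) = 1 - \epsilon$ and $R(0) = 0$ with an arbitrary small $\epsilon$, and a Poisson degree distribution with mean 5, for which $\psi(x) = e^{-5(1-x)}$.

```python
import math

def solve_cm(psi, dpsi, beta, gamma, eps=1e-4, t_max=40.0, dt=0.01):
    """Integrate the CM equations for (theta, R) with RK4.

    Returns a list of (t, theta, S, I, R). eps sets the assumed initial
    condition theta(0) = 1 - eps, with R(0) = 0.
    """
    mean_k = dpsi(1.0)

    def rhs(theta, R):
        I = 1.0 - psi(theta) - R
        dtheta = -beta * theta + beta * dpsi(theta) / mean_k + gamma * (1.0 - theta)
        return dtheta, gamma * I

    theta, R = 1.0 - eps, 0.0
    out = [(0.0, theta, psi(theta), 1.0 - psi(theta) - R, R)]
    for i in range(1, int(round(t_max / dt)) + 1):
        k1 = rhs(theta, R)
        k2 = rhs(theta + 0.5 * dt * k1[0], R + 0.5 * dt * k1[1])
        k3 = rhs(theta + 0.5 * dt * k2[0], R + 0.5 * dt * k2[1])
        k4 = rhs(theta + dt * k3[0], R + dt * k3[1])
        theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        R += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        S = psi(theta)
        out.append((i * dt, theta, S, 1.0 - S - R, R))
    return out

# Poisson degree distribution with mean 5 (an illustrative choice).
def psi(x):
    return math.exp(-5.0 * (1.0 - x))

def dpsi(x):
    return 5.0 * math.exp(-5.0 * (1.0 - x))

trajectory = solve_cm(psi, dpsi, beta=0.6, gamma=1.0)
t, theta, S, I, R = trajectory[-1]
print(f"theta = {theta:.4f}, final size R = {R:.4f}")
```

Only two quantities are integrated, $\theta$ and $R$; $S$ and $I$ are recovered algebraically at each step, which is what keeps the model low-dimensional regardless of the maximum degree.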

The edge-based compartmental modeling approach we have introduced forms the basis of our paper. 
Depending on the network structure, some details will change. However, we will remain as consistent as 
possible. 



2.1 $\mathcal{R}_0$ and final size

One of the most important parameters for an infectious disease is its basic reproductive number $\mathcal{R}_0$, the
average number of infections caused by a node infected early in an epidemic. When $\mathcal{R}_0 < 1$ epidemics are
impossible, while when $\mathcal{R}_0 > 1$ they are possible, though not guaranteed. For this model, we find that $\mathcal{R}_0$
is (see Appendix)

$$\mathcal{R}_0 = \frac{\beta}{\beta + \gamma} \frac{\langle K^2 - K \rangle}{\langle K \rangle}.$$

We want the expected final size if an epidemic occurs. We set $\dot{\theta} = 0$ and solve

$$\theta(\infty) = \frac{\gamma}{\beta + \gamma} + \frac{\beta}{\beta + \gamma} \frac{\psi'(\theta(\infty))}{\psi'(1)}$$

for $\theta(\infty)$. If $\mathcal{R}_0 > 1$ this has two solutions, the larger of which is $\theta = 1$ (the pre-disease equilibrium). We want
the smaller solution. The total fraction of the population infected in the course of an epidemic is
$R(\infty) = 1 - \psi(\theta(\infty))$. These calculations of $\mathcal{R}_0$ and $R(\infty)$ are in agreement with previous observations [32. 421 [21 [31151].
When $\mathcal{R}_0 < 1$, our approach breaks down: full details are in the Appendix.
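Both quantities are straightforward to evaluate numerically. The sketch below computes $\mathcal{R}_0$ from the moments of the degree distribution and finds $\theta(\infty)$ by fixed-point iteration. Starting the iteration at $\theta = 0$ is a choice made here so that the iterates increase toward the smaller root rather than the disease-free root $\theta = 1$. The function names and the Poisson example are assumptions of the sketch.

```python
import math

def cm_r0(beta, gamma, mean_k, mean_k2_minus_k):
    """R_0 = beta / (beta + gamma) * <K^2 - K> / <K> for a CM network."""
    return beta / (beta + gamma) * mean_k2_minus_k / mean_k

def cm_final_theta(dpsi, beta, gamma, tol=1e-12, max_iter=10000):
    """Smaller root of theta = gamma/(beta+gamma) + beta/(beta+gamma) * psi'(theta)/psi'(1)."""
    mean_k = dpsi(1.0)
    theta = 0.0
    for _ in range(max_iter):
        new = gamma / (beta + gamma) + beta / (beta + gamma) * dpsi(theta) / mean_k
        if abs(new - theta) < tol:
            return new
        theta = new
    raise RuntimeError("fixed-point iteration did not converge")

# Poisson degree distribution with mean 5: <K> = 5 and <K^2 - K> = 25.
def psi(x):
    return math.exp(-5.0 * (1.0 - x))

def dpsi(x):
    return 5.0 * math.exp(-5.0 * (1.0 - x))

r0 = cm_r0(0.6, 1.0, 5.0, 25.0)
theta_inf = cm_final_theta(dpsi, 0.6, 1.0)
print(f"R0 = {r0:.3f}, final size = {1.0 - psi(theta_inf):.4f}")
```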



2.2 Example 

We consider a disease with $\beta = 0.6$ and $\gamma = 1$. Figure 3 compares simulations with solutions to our ODEs
using four different CM networks, each with $5 \times 10^5$ nodes and average degree $\langle K \rangle = 5$, but different degree
distributions. In order from latest peak to earliest peak, the networks are: every node has degree 5, the
degree distribution is Poisson with mean 5, half the nodes have degree 2 and the other half degree 8, and
finally a truncated power-law distribution in which $P(k) \propto k^{-\nu} e^{-k/40}$ where $\nu = 1.418$. We see that the
degree distribution significantly alters the spread, with increased heterogeneity leading to an earlier peak,
but generally a smaller epidemic. Our predictions fit, while the MA model using $\hat{\beta} = \beta \langle K \rangle$ fails.

Figure 3: Configuration Model Example (section 2.2). Model predictions (dashed) match simulated epidemics (solid)
of the same disease on four Configuration Model networks with $\langle K \rangle = 5$ and $5 \times 10^5$ nodes, but different degree distributions.
Each solid curve is a single simulation. Time is set so $t = 0$ when there is 1% cumulative incidence. The corresponding MA
model (short dashes) based on the average degree does not match.

3 Actual Degree Models 

For the CM networks, each node has a specific number of stubs. Edges are created by pairing stubs, and 
no changes are allowed. In generalizing to other "actual degree" models, we assign each node a number of 
stubs, but allow edges to break and the freed stubs to create new edges. We consider three limits: In the 
first, the Mean Field Social Heterogeneity model, at every moment a stub is connected to a new neighbor. 
In the second, the Dynamic Fixed-Degree model, edges last for some time before breaking. When an edge 
breaks, the stubs immediately form new edges with stubs from other edges that have just broken. In the 
third, the Dormant Contact model, we assume edges break as in the Dynamic Fixed-Degree model, but stubs 
wait before finding new neighbors. 

3.1 Mean Field Social Heterogeneity 

We analyze the Mean Field Social Heterogeneity (MFSH) model similarly. We take $\theta$ as the probability a
stub has never transmitted infection to the test node $u$ from any neighbor. To define $\phi_S$, $\phi_I$, and $\phi_R$ we
require that the stub has not transmitted infection to $u$ and additionally the current neighbor is susceptible,
infected, or recovered. Since at each moment an individual chooses a new neighbor, the probability of
connecting to a node of a given status is the proportion of all stubs belonging to nodes of that status. We
must track the proportion of stubs that belong to susceptible, infected, or recovered nodes $\pi_S$, $\pi_I$, and $\pi_R$.
Because of the rapid turnover of neighbors, we find that $\phi_S$ is the product of the probability that a stub
has not transmitted, $\theta$, with the probability it has just joined with a susceptible neighbor, $\pi_S$, so $\phi_S = \theta \pi_S$.
Similarly $\phi_I = \theta \pi_I$ and $\phi_R = \theta \pi_R$.

We create flow diagrams as before. The $S$, $I$, and $R$ diagram is unchanged, but the diagram for the $\phi$
variables and $1 - \theta$ changes. There are no $\phi_S$ to $\phi_I$ or $\phi_I$ to $\phi_R$ fluxes because of the explicit assumption
that the neighbors at any two times are independent. The change in neighbor status is due to change of
neighbor. The flux into $1 - \theta$ from $\phi_I$ is $\beta \phi_I$ as before. We need a new flow diagram for $\pi_S$, $\pi_I$, and $\pi_R$
similar to that for $S$, $I$, and $R$. Stubs belonging to infected nodes become stubs belonging to recovered nodes
at rate $\gamma$, thus $\dot{\pi}_R = \gamma \pi_I$. We calculate $\pi_S$ explicitly: the probability a stub belongs to a node of degree $k$
is $k P(k) / \langle K \rangle$, and the probability the node is susceptible is $\theta^k$. Taking the weighted average of this we find
$\pi_S = \theta \psi'(\theta) / \psi'(1)$. Finally, $\pi_I = 1 - \pi_S - \pi_R$.

Figure 4: Mean Field Social Heterogeneity model. The flow diagram for the MFSH model (actual degree formulation).
Because contacts are durationless, neighbors do not change status while joined to an individual, so there is no flux between the
$\phi$ variables (left). The new variables $\pi_S$, $\pi_I$, and $\pi_R$ (bottom right) represent the probability that a randomly selected stub
belongs to a susceptible, infected, or recovered node. We can find $\pi_S$ in terms of $\theta$ and then solve for $\pi_I$ and $\pi_R$ in much the
same way we solve for $I$ and $R$ in the CM model. We then find each $\phi$ variable is $\theta$ times the corresponding $\pi$ variable.

Combining these observations, $\dot{\pi}_R = \gamma \pi_I = \gamma \phi_I / \theta = -(\gamma / \beta) \dot{\theta} / \theta$. So $\pi_R = -(\gamma / \beta) \ln \theta$ (the constant of
integration is 0). We have $\pi_I = 1 - \pi_S - \pi_R = 1 - \theta \psi'(\theta) / \psi'(1) + (\gamma / \beta) \ln \theta$ and $\dot{\theta} = -\beta \theta \pi_I$. Thus

$$\dot{\theta} = -\beta \theta + \beta \theta^2 \frac{\psi'(\theta)}{\psi'(1)} - \theta \gamma \ln \theta, \tag{6}$$

$$\dot{R} = \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - S - R. \tag{7}$$

The MFSH model has been considered previously [H [53 EHl HOI HI], with the population stratified by
degree. Setting $\zeta$ to be the proportion of all stubs which belong to infected nodes (equivalent to $\pi_I$ above),
the pre-existing system is

$$\dot{S}_k = -\beta k S_k \zeta, \qquad \dot{I}_k = \beta k S_k \zeta - \gamma I_k, \qquad \zeta = \frac{\sum_k k P(k) I_k}{\langle K \rangle},$$

where $S_k$ and $I_k$ are the probabilities a random individual with $k$ contacts is susceptible or infected. A
known change of variables reduces this to a few equations equivalent to ours (see Appendix).



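3.1.1 $\mathcal{R}_0$ and final size

We find

$$\mathcal{R}_0 = \frac{\beta}{\gamma} \frac{\langle K^2 \rangle}{\langle K \rangle} \tag{8}$$

consistent with previous observations Ij. The total proportion infected is $R(\infty) = 1 - \psi(\theta(\infty))$ where

$$\theta(\infty) = \exp\left[ -\frac{\beta}{\gamma} \left( 1 - \frac{\theta(\infty) \psi'(\theta(\infty))}{\psi'(1)} \right) \right]. \tag{9}$$

Full details are in the Appendix.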
Figure 5: Mean Field Social Heterogeneity Example (section 3.1.2). Model predictions (dashed) match a simulated
epidemic (solid) in a population of $5 \times 10^5$ nodes. The solid curve is a single simulation. Time is set so $t = 0$ when there is 1%
cumulative incidence.



3.1.2 Example 

We take a population with degrees 1, 5, and 25. The proportions are chosen such that an equal number of
stubs belong to each class: $P(1) = 25/31$, $P(5) = 5/31$, and $P(25) = 1/31$. Thus

$$\psi(x) = \frac{25 x + 5 x^5 + x^{25}}{31}.$$

We set $\beta = \gamma = 1$ and compare a simulation in a population of $5 \times 10^5$ with theory in figure 5.
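For this example, $\mathcal{R}_0$ and the final size follow directly from (8) and (9). The sketch below evaluates them for the stated degree distribution and $\beta = \gamma = 1$. The fixed-point iteration and its starting value $\theta = 0$ are choices made for this illustration, not a method prescribed by the paper.

```python
import math

# Degree distribution of the example: equal numbers of stubs in each class.
P = {1: 25 / 31, 5: 5 / 31, 25: 1 / 31}

def psi(x):
    return sum(p * x**k for k, p in P.items())

def dpsi(x):
    return sum(k * p * x**(k - 1) for k, p in P.items())

def mfsh_r0(beta, gamma):
    """Equation (8): R_0 = (beta / gamma) * <K^2> / <K>."""
    mean_k = sum(k * p for k, p in P.items())
    mean_k2 = sum(k * k * p for k, p in P.items())
    return beta / gamma * mean_k2 / mean_k

def mfsh_final_theta(beta, gamma, tol=1e-12, max_iter=10000):
    """Solve equation (9) by fixed-point iteration, starting from theta = 0."""
    theta = 0.0
    for _ in range(max_iter):
        new = math.exp(-(beta / gamma) * (1.0 - theta * dpsi(theta) / dpsi(1.0)))
        if abs(new - theta) < tol:
            return new
        theta = new
    raise RuntimeError("fixed-point iteration did not converge")

r0 = mfsh_r0(1.0, 1.0)
theta_inf = mfsh_final_theta(1.0, 1.0)
print(f"R0 = {r0:.3f}, theta(inf) = {theta_inf:.4f}, final size = {1.0 - psi(theta_inf):.4f}")
```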



3.2 Dynamic Fixed-Degree 

The Dynamic Fixed-Degree (DFD) model interpolates between the CM and MFSH models. We assign
each node's degree $k$ as before and pair stubs randomly. As time progresses, edges break. The freed stubs
immediately join with stubs from other edges that break, a process we refer to as "edge swapping". The
rate an edge breaks is $\eta$.

We develop flow diagrams (figure 6) as before. The $S$, $I$, and $R$ diagram is unchanged. We again track
the probabilities $\pi_S$, $\pi_I$, and $\pi_R$ that a random stub belongs to a susceptible, infected, or recovered node.
Their diagram is unchanged from the MFSH model. The diagram for $\theta$ and the $\phi$ variables changes: we have fluxes from $\phi_S$ to $\phi_I$
and $\phi_I$ to $\phi_R$ representing infection or recovery of the neighbor as in the CM model, but we also have fluxes
from $\phi_S$ to $\phi_S$, $\phi_I$, or $\phi_R$ resulting from edge swapping. We have similar edge swapping fluxes from $\phi_I$ and
$\phi_R$. The flux into $\phi_S$ from edge swapping is $\eta \theta \pi_S$. The flux out of $\phi_S$ from edge swapping is $\eta \phi_S$. Similar
results hold for $\phi_I$ and $\phi_R$.

Our earlier techniques to find $\phi_I$ break down. We solve for $\phi_S$ and $\phi_I$ using ODEs. To complete the
system, we need the $\phi_S$ to $\phi_I$ flux. Consider a neighbor $v$ of our test node $u$ such that: the stub belonging
to $u$ never transmitted to $u$ and the stub belonging to $v$ never transmitted to $v$ prior to the $u$-$v$ edge forming.
Given this, the probability $v$ is susceptible is $q = \sum_k k P(k) \theta^{k-1} / \langle K \rangle = \psi'(\theta) / \psi'(1)$. Thus, given that $v$ is
susceptible, $v$ becomes infected at rate

$$-\frac{\dot{q}}{q} = -\frac{\dot{\theta} \, \psi''(\theta) / \psi'(1)}{\psi'(\theta) / \psi'(1)} = \beta \phi_I \frac{\psi''(\theta)}{\psi'(\theta)}.$$

Thus the $\phi_S$ to $\phi_I$ flux is the product of $\phi_S$, the probability a stub has not transmitted infection to the test
node and connects to a susceptible node, with $\beta \phi_I \psi''(\theta) / \psi'(\theta)$, the rate the node becomes infected given
that the stub has not transmitted and connects to a susceptible node. This completes figure 6.

Figure 6: Dynamic Fixed-Degree model. The flow diagram for the DFD model. Unlike the CM case, we cannot calculate
$\phi_S$ explicitly, so we must calculate the $\phi_S$ to $\phi_I$ flux.



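The model requires more equations, but remains relatively simple:

$$\dot{\theta} = -\beta \phi_I, \tag{10}$$

$$\dot{\phi}_S = -\beta \phi_I \phi_S \frac{\psi''(\theta)}{\psi'(\theta)} + \eta \theta \pi_S - \eta \phi_S, \tag{11}$$

$$\dot{\phi}_I = \beta \phi_I \phi_S \frac{\psi''(\theta)}{\psi'(\theta)} + \eta \theta \pi_I - (\beta + \gamma + \eta) \phi_I, \tag{12}$$

$$\dot{\pi}_R = \gamma \pi_I, \qquad \pi_S = \frac{\theta \psi'(\theta)}{\psi'(1)}, \qquad \pi_I = 1 - \pi_R - \pi_S, \tag{13}$$

$$\dot{R} = \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - S - R. \tag{14}$$

This is simpler than, but equivalent to, the model of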


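3.2.1 $\mathcal{R}_0$ and final size

We find

$$\mathcal{R}_0 = \frac{\beta}{\beta + \eta + \gamma} \left( \frac{\eta}{\gamma} + \frac{\eta + \gamma}{\gamma} \frac{\langle K^2 - K \rangle}{\langle K \rangle} \right). \tag{15}$$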



We do not find a simple expression for final size. Instead we must solve the ODEs numerically. Full details 
are in the Appendix. 

3.2.2 Example 

We choose a population having negative binomial degree distribution NB(4, 1/3) with size $r = 4$ and probability
$p = 1/3$. Thus $P(k) = \binom{k+r-1}{k} (1-p)^r p^k$. The mean is 2 and the variance 3. For negative binomial
distributions $\psi(x) = [(1 - p)/(1 - px)]^r$, so

$$\psi(x) = \left( \frac{2}{3 - x} \right)^4.$$

Figure 7: Dynamic Fixed-Degree Example (section 3.2.2). Model predictions (dashed) match the average of 102
simulated epidemics (solid) in a population of $10^4$ nodes. For each simulation, time is chosen so that $t = 0$ corresponds to 3%
cumulative incidence. Then they are averaged to give the solid curve.

We take $\beta = 5/4$, $\gamma = 1$, and $\eta = 1/2$. The equations accurately predict the spread (figure 7).

Our simulations are slower because we must track edges, so we have used smaller population sizes. To 
reduce noise, we perform 250 simulations, averaging the 102 that become epidemics. 
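As a check on these parameter choices, equation (15) can be evaluated directly. The sketch below (function name hypothetical) uses $\langle K \rangle = 2$ and $\langle K^2 - K \rangle = \psi''(1) = 5$ for NB(4, 1/3), and also confirms numerically that (15) tends to the CM value as $\eta \to 0$ and to the MFSH value (8) as $\eta \to \infty$.

```python
def dfd_r0(beta, gamma, eta, mean_k, mean_k2_minus_k):
    """Equation (15) for the Dynamic Fixed-Degree model."""
    ratio = mean_k2_minus_k / mean_k
    return beta / (beta + eta + gamma) * (eta / gamma + (eta + gamma) / gamma * ratio)

# NB(4, 1/3): mean 2, variance 3, so <K^2> = 7 and <K^2 - K> = 5.
r0_example = dfd_r0(1.25, 1.0, 0.5, 2.0, 5.0)
r0_static = dfd_r0(1.25, 1.0, 0.0, 2.0, 5.0)    # eta = 0: CM limit
r0_fast = dfd_r0(1.25, 1.0, 1e9, 2.0, 5.0)      # very large eta: approaches MFSH
print(f"{r0_example:.4f} {r0_static:.4f} {r0_fast:.4f}")
```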

3.3 Dormant Contacts 

We finally generalize the DFD model, allowing stubs to enter a dormant phase after edges break. This 
Dormant Contact (DC) model is appropriate for serial monogamy where individuals do not immediately find 
a new partner. It is the most general model we present: it reduces to any model of this paper in appropriate 
limits [SI]. 

A node is assigned $k_m$ stubs using the probability mass function $P(k_m)$. We take $\psi(x) = \sum_{k_m} P(k_m) x^{k_m}$.
A stub is dormant or active depending on whether it is currently connected to a neighbor. A node's maximum
degree is $k_m$; its "active" and "dormant" degrees are $k_a$ and $k_d$ respectively, with $k_a + k_d = k_m$. In
addition to $\phi_S$, $\phi_I$, and $\phi_R$, we add $\phi_D$ denoting the probability a stub is dormant and has never transmitted
infection from a neighbor, so $\theta = \phi_S + \phi_I + \phi_R + \phi_D$. Active stubs become dormant at rate $\eta_2$ and dormant
stubs become active at rate $\eta_1$.

We now develop flow diagrams (figure 8). The diagram for $S$, $I$, and $R$ is as before. The diagram for $\theta$
and the $\phi$ variables is similar to the DFD model, but with the new compartment $\phi_D$. The fluxes associated
with edge breaking are at rate $\eta_2$ times $\phi_S$, $\phi_I$, or $\phi_R$ and go from the appropriate compartment into $\phi_D$,
for a total of $\eta_2 (\theta - \phi_D)$. To describe fluxes due to edge creation, we generalize the definitions of $\pi_S$, $\pi_I$,
and $\pi_R$ to give the probability a stub is dormant (and thus available to form a new contact) and belongs to
a susceptible, infected, or recovered node, with $\pi = \pi_S + \pi_I + \pi_R$ the probability a stub is dormant. The
probability a new neighbor is susceptible, infected, or recovered is $\pi_S / \pi$, $\pi_I / \pi$, and $\pi_R / \pi$ respectively. The
fluxes associated with edge creation occur at total rate $\eta_1 \phi_D$, with proportions $\pi_S / \pi$, $\pi_I / \pi$, and $\pi_R / \pi$ into
$\phi_S$, $\phi_I$, and $\phi_R$ respectively.

The flow diagram for the $\pi$ variables is related to that for the DFD model, but we must account for
active and dormant stubs. We use $\xi_S$, $\xi_I$, and $\xi_R$ to be the probabilities a stub is active and belongs to each
type of node, with $1 - \pi = \xi = \xi_S + \xi_I + \xi_R$ the probability a stub is active. The $\pi_S$ to $\xi_S$ and $\xi_S$ to $\pi_S$
fluxes are $\eta_1 \pi_S$ and $\eta_2 \xi_S$ respectively. Similar results hold for the other compartments. We can use this to
show $\dot{\xi} = -\eta_2 \xi + \eta_1 \pi = -\eta_2 \xi + \eta_1 (1 - \xi)$, from which we can conclude that (at equilibrium) $\pi = \eta_2 / (\eta_1 + \eta_2)$
and $\xi = \eta_1 / (\eta_1 + \eta_2)$. The fluxes from $\xi_I$ and $\pi_I$ to $\xi_R$ and $\pi_R$ respectively represent recovery of the node
the stub belongs to and so are $\gamma \xi_I$ and $\gamma \pi_I$ respectively.

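We can calculate $\xi_S$ and $\pi_S$ explicitly. The probability a dormant stub belongs to a susceptible node
is $\pi_S = \phi_D \psi'(\theta) / \psi'(1)$ where $\phi_D$ is the probability the dormant stub has never received infection, and
$\psi'(\theta) / \psi'(1)$ is the probability that none of the other stubs have received infection. Similarly, the probability
an active stub belongs to a susceptible node is $\xi_S = (\theta - \phi_D) \psi'(\theta) / \psi'(1)$. We can use $\pi_I = \pi - \pi_S - \pi_R$ and
$\xi_I = \xi - \xi_S - \xi_R$ to simplify the system further.

Figure 8: Dormant Contact model. A flow diagram accounting for the dormant stage. (Left) Movement of stubs between
different stages, including dormant. Stubs are classified by whether they have received infection and the status (or existence)
of the current neighbor. (Middle) The flow of individuals between different states. (Right) Movement of stubs between states
with stubs classified based on the status of the node they belong to.

Our new equations are

$$\begin{aligned}
\dot{\theta} &= -\beta \phi_I, \\
\dot{\phi}_S &= -\beta \phi_I \phi_S \frac{\psi''(\theta)}{\psi'(\theta)} + \eta_1 \frac{\pi_S}{\pi} \phi_D - \eta_2 \phi_S, \\
\dot{\phi}_I &= \beta \phi_I \phi_S \frac{\psi''(\theta)}{\psi'(\theta)} + \eta_1 \frac{\pi_I}{\pi} \phi_D - (\eta_2 + \beta + \gamma) \phi_I, \\
\dot{\phi}_D &= \eta_2 (\theta - \phi_D) - \eta_1 \phi_D, \\
\dot{\xi}_R &= -\eta_2 \xi_R + \eta_1 \pi_R + \gamma \xi_I, \qquad \dot{\pi}_R = \eta_2 \xi_R - \eta_1 \pi_R + \gamma \pi_I, \\
\xi_S &= (\theta - \phi_D) \frac{\psi'(\theta)}{\psi'(1)}, \qquad \pi_S = \phi_D \frac{\psi'(\theta)}{\psi'(1)}, \\
\xi_I &= \xi - \xi_S - \xi_R, \qquad \pi_I = \pi - \pi_S - \pi_R, \\
\xi &= \frac{\eta_1}{\eta_1 + \eta_2}, \qquad \pi = \frac{\eta_2}{\eta_1 + \eta_2}, \\
\dot{R} &= \gamma I, \qquad S = \psi(\theta), \qquad I = 1 - S - R.
\end{aligned} \tag{16--23}$$

3.3.1 $\mathcal{R}_0$ and final size

We can show that

$$\mathcal{R}_0 = \frac{\beta}{\beta + \eta_2 + \gamma} \left( \frac{\langle K_m^2 - K_m \rangle}{\langle K_m \rangle} \frac{\eta_1}{\eta_1 + \eta_2} \frac{\eta_2 + \gamma}{\gamma} + \frac{\eta_1 \eta_2}{\gamma (\gamma + \eta_1 + \eta_2)} \right). \tag{24}$$

However, we have not found a simple final size relation. Details are in the Appendix.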



3.3.2 Example 

In figure 9 we consider the spread of a disease through a network with dormant edges. The distribution of $k_m$ is Poisson with mean 3, so $\psi(x) = e^{-3(1-x)}$.

Figure 9: Dormant Contact example (section 3.3.2). The average of 155 simulated epidemics in a population of 5000 nodes (solid) with a Poisson maximum degree distribution of mean 3 compared with theory (dashed). Simulations are shifted in time so that $t = 0$ corresponds to 3% cumulative incidence and then averaged. Because stochastic effects are not negligible, the individual peaks are not perfectly aligned, so the averaging has a small, but noticeable, effect to reduce and broaden the simulated peak. In larger populations this disappears.

The parameters are $\beta = 2$, $\gamma = 1$, $\eta_1 = 1$, and $\eta_2 = 1/2$. Simulations are again slow, so we use a smaller population with 322 simulations, and average the 155 that became epidemics.
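As a rough check that these parameters lie above the epidemic threshold, the sketch below evaluates the Dormant Contact $\mathcal{R}_0$ of section 3.3.1. The function name `dc_r0` and its argument names are ours, chosen for illustration. The expression coded is $\mathcal{R}_0 = \frac{\beta}{\beta+\eta_2+\gamma}\left[\frac{\langle K_m^2-K_m\rangle}{\langle K_m\rangle}\frac{\eta_1}{\eta_1+\eta_2}\frac{\eta_2+\gamma}{\gamma} + \frac{\eta_1\eta_2}{\gamma(\gamma+\eta_1+\eta_2)}\right]$, and we use the fact that $\langle K_m^2 - K_m\rangle/\langle K_m\rangle = 3$ for a Poisson distribution of mean 3.

```python
def dc_r0(beta, gamma, eta1, eta2, excess_degree):
    """R0 for the Dormant Contact model (illustrative helper, names are ours).

    excess_degree is <K_m^2 - K_m>/<K_m>, the expected number of stubs a newly
    infected node has in addition to the stub that delivered its infection.
    """
    prefactor = beta / (beta + eta2 + gamma)
    # Contribution of the stubs that were not joined to the infector.
    other_stubs = excess_degree * eta1 / (eta1 + eta2) * (eta2 + gamma) / gamma
    # Contribution of the stub joined to the infector, which can break and
    # later form an edge with a new, susceptible partner.
    infector_stub = eta1 * eta2 / (gamma * (gamma + eta1 + eta2))
    return prefactor * (other_stubs + infector_stub)


# Poisson maximum degree with mean 3 gives <K_m^2 - K_m>/<K_m> = 3.
r0_example = dc_r0(beta=2.0, gamma=1.0, eta1=1.0, eta2=0.5, excess_degree=3.0)
print(r0_example)
```

For the parameters of this example the value is $64/35 \approx 1.83$, so epidemics are possible, in line with the 155 of 322 simulations that took off.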



4 Expected Degree Models 

We now consider SIR diseases spreading through "expected degree" networks. In these networks, each individual has an expected degree $\kappa$, which need not be an integer. It is assigned using the probability density function $\rho(\kappa)$. Edges are placed between two nodes with probability proportional to the expected degrees of the two nodes. In the actual degree models, once a stub belonging to u was joined into an edge, it became unavailable for other edges: the existence of a u-v edge reduced the ability of u to form other edges. In contrast, for expected degree models, edges are assigned independently: a u-v edge does not affect whether u forms other edges. Similarly to before, a neighbor will tend to have larger expected degree than a randomly chosen node. The probability density function for a neighbor to have expected degree $\kappa$ is

$$\rho_n(\kappa) = \kappa\rho(\kappa)/\langle K\rangle.$$

Our approach remains similar. We consider a randomly chosen test node u which cannot infect its neighbors, and calculate the probability u is susceptible. We first consider the spread of disease through static "Mixed Poisson" networks (also called Chung-Lu networks), for which an edge from u to v exists with probability $\kappa_u\kappa_v/(N-1)\langle K\rangle$. We then consider the expected degree formulation of Mean Field Social Heterogeneity. Following this, we consider the more general Dynamic Variable-Degree model, for which a node creates and deletes edges as independent events, unlike the DFD model where deleted edges were instantly replaced. The Mixed Poisson model produces equations effectively identical to the CM equations. The Mean Field Social Heterogeneity equations differ somewhat from the actual degree version, but may be shown [38] to be formally equivalent. The Dynamic Variable-Degree equations are simpler than the DFD equations, and the model may be more realistic because it does not enforce constant degree for an individual.

4.1 Mixed Poisson 

We now consider the Mixed Poisson (MP) model. In this model, each node is assigned an expected degree $\kappa$ using the probability density function $\rho(\kappa)$. A u-v edge exists with probability $\kappa_u\kappa_v/(N-1)\langle K\rangle$ independently of other edges. We use the name "Mixed Poisson" because at large $N$ the actual degree of nodes with expected degree $\kappa$ is chosen from a Poisson distribution with mean $\kappa$. The degree distribution is a mixture of Poisson distributions.

Figure 10: Mixed Poisson model. (Left) The flux of edges for a static Mixed Poisson network. (Right) The flux of individuals through the different stages.



Consider two nodes u and v whose expected degrees are $\kappa_u$ and $\kappa_v = \kappa_u + \Delta\kappa$ with $\Delta\kappa \ll 1$. Our question is, how much does the additional $\Delta\kappa$ reduce the probability v is susceptible? At leading order it contributes an extra edge to v with probability $\Delta\kappa$, and we may assume it contributes at most one additional edge. We define $\Theta$ to be the probability an edge has not transmitted infection. With probability $\Theta\Delta\kappa$ there is an additional edge which has not transmitted. The probability the extra $\Delta\kappa$ either does not contribute an edge or contributes an edge which has not transmitted is $1 - \Delta\kappa + \Theta\Delta\kappa = 1 - (1-\Theta)\Delta\kappa$. If $s(\kappa,t)$ is the probability a node of expected degree $\kappa$ is susceptible, then we have $s(\kappa+\Delta\kappa,t) = s(\kappa,t)[1 - (1-\Theta)\Delta\kappa]$. Taking $\Delta\kappa \to 0$, this becomes $\partial s/\partial\kappa = -(1-\Theta)s$. Thus $s(\kappa,t) = \exp[-\kappa(1-\Theta)]$ and $S(t) = \Psi(\Theta(t))$ where

$$\Psi(x) = \int_0^\infty e^{-\kappa(1-x)}\rho(\kappa)\,d\kappa.$$

Note that this is the Laplace transform of $\rho$ evaluated at $1 - x$. As before, figure 10 gives

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R$$

and we need $\Theta(t)$.

We follow the CM approach almost exactly. The value of $\Theta$ is the probability an edge has not transmitted to the test node u. We define $\Phi_S$, $\Phi_I$, and $\Phi_R$ to be the probabilities an edge has not transmitted to u and connects to a susceptible, infected, or recovered node, so $\Theta = \Phi_S + \Phi_I + \Phi_R$. To calculate $\Phi_S$, we observe that a neighbor v of u with expected degree $\kappa$ has the same probability of having an edge to any $w \neq u$ as any other node of expected degree $\kappa$ because edges are created independently of one another. Thus given $\kappa$, v is susceptible with probability $s(\kappa,t)$. Taking the weighted average over all $\kappa$ gives $\Phi_S = \int_0^\infty \rho_n(\kappa)s(\kappa,t)\,d\kappa = \int_0^\infty \kappa\exp[-\kappa(1-\Theta)]\rho(\kappa)\,d\kappa/\langle K\rangle = \Psi'(\Theta)/\Psi'(1)$. The same techniques as for the CM networks give $\Phi_R = \gamma(1-\Theta)/\beta$. Our equations are

$$\dot{\Theta} = -\beta\Theta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta),$$

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R.$$

These equations are almost identical to those of CM epidemics except that $\Psi$ and $\Theta$ replace $\psi$ and $\theta$. This is not coincidence. In fact the MP networks are a special case of CM networks [35].
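The MP system is small enough to integrate with a few lines of code. The following is a minimal sketch, not the code used for the figures: it applies a fixed-step fourth-order Runge-Kutta scheme to the $\Theta$ and $R$ equations. The function name `integrate_mp`, the initial value $\Theta(0) = 1 - 10^{-3}$, and the exponential density $\rho(\kappa) = e^{-\kappa/m}/m$ (for which $\Psi(x) = 1/(1 + m(1-x))$) are all assumptions made for illustration.

```python
def integrate_mp(beta, gamma, psi, dpsi, theta0=1 - 1e-3, t_end=40.0, dt=0.01):
    """Integrate the Mixed Poisson equations with a fixed-step RK4 scheme.

    psi and dpsi are Psi and its derivative. Returns a list of
    (t, S, I, R, Theta) tuples, one per time step.
    """
    dpsi1 = dpsi(1.0)

    def rhs(theta, r):
        dtheta = -beta * theta + beta * dpsi(theta) / dpsi1 + gamma * (1.0 - theta)
        dr = gamma * (1.0 - psi(theta) - r)  # dR/dt = gamma * I with I = 1 - S - R
        return dtheta, dr

    theta, r = theta0, 0.0
    steps = int(round(t_end / dt))
    history = []
    for n in range(steps + 1):
        s = psi(theta)
        history.append((n * dt, s, 1.0 - s - r, r, theta))
        k1 = rhs(theta, r)
        k2 = rhs(theta + 0.5 * dt * k1[0], r + 0.5 * dt * k1[1])
        k3 = rhs(theta + 0.5 * dt * k2[0], r + 0.5 * dt * k2[1])
        k4 = rhs(theta + dt * k3[0], r + dt * k3[1])
        theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        r += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return history


# Assumed example: rho(kappa) = exp(-kappa/m)/m, so Psi(x) = 1/(1 + m(1 - x)).
m = 4.0
psi = lambda x: 1.0 / (1.0 + m * (1.0 - x))
dpsi = lambda x: m / (1.0 + m * (1.0 - x)) ** 2
history = integrate_mp(beta=0.5, gamma=1.0, psi=psi, dpsi=dpsi)
t, S, I, R, theta = history[-1]
print(f"t={t:.1f}  S={S:.4f}  I={I:.4f}  R={R:.4f}  Theta={theta:.4f}")
```

With these assumed parameters the right-hand side of the $\Theta$ equation vanishes at $\Theta = 3/4$, so the trajectory should settle near $S = R = 1/2$, which gives a quick sanity check on the integrator.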

4.1.1 $\mathcal{R}_0$ and final size

We find

$$\mathcal{R}_0 = \frac{\beta}{\beta+\gamma}\,\frac{\langle K^2\rangle}{\langle K\rangle},$$

where $\langle K^2\rangle$ is the average of $\kappa^2$ (which equals the average of $k^2 - k$) and $\langle K\rangle$ is the average of $\kappa$ (which equals the average of $k$). The total proportion infected by an epidemic is $R(\infty) = 1 - \Psi(\Theta(\infty))$, where

$$\Theta(\infty) = \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}\,\frac{\Psi'(\Theta(\infty))}{\Psi'(1)}.$$

Full details are in the Appendix.

Figure 11: Mixed Poisson Example (section 4.1.2). Model predictions (dashed) match a simulated epidemic (solid) in a Mixed Poisson network with $5 \times 10^5$ nodes. The solid curve is a single simulation. Time is chosen so that $t = 0$ corresponds to 1% cumulative incidence.



4.1.2 Example 

We consider a population whose distribution of expected degrees satisfies

$$\rho(\kappa) = \begin{cases} 1/4 & 0 < \kappa < 2 \\ 1/20 & 10 < \kappa < 20 \\ 0 & \text{otherwise.} \end{cases}$$

Half the individuals have an expected degree between 0 and 2 uniformly, and the other half have expected degree between 10 and 20 uniformly. This gives

$$\Psi(x) = \frac{1}{x-1}\left(\frac{e^{2(x-1)} - 1}{4} + \frac{e^{20(x-1)} - e^{10(x-1)}}{20}\right).$$

We take $\beta = 0.15$ and $\gamma = 1$, and perform simulations with a population of size $5 \times 10^5$ generated using the algorithm of [36]. We compare simulation and prediction in figure 11.
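The threshold and final size for this example can be checked numerically. The sketch below is illustrative only: it evaluates the moments of $\rho$ and the derivative $\Psi'$ by a midpoint rule (our choice, to avoid the removable singularity of the closed form at $x = 1$), computes $\mathcal{R}_0 = \frac{\beta}{\beta+\gamma}\langle K^2\rangle/\langle K\rangle$, and then iterates $\Theta \leftarrow \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}\Psi'(\Theta)/\Psi'(1)$ from $\Theta = 0$. The helper name `moment_integral` and the number of quadrature points are assumptions.

```python
import math

BETA, GAMMA = 0.15, 1.0
# Piecewise-constant density of the example: (lower, upper, density).
PIECES = [(0.0, 2.0, 0.25), (10.0, 20.0, 0.05)]


def moment_integral(f, n=4000):
    """Midpoint-rule approximation of the integral of f(kappa) * rho(kappa)."""
    total = 0.0
    for lo, hi, density in PIECES:
        h = (hi - lo) / n
        total += sum(f(lo + (i + 0.5) * h) for i in range(n)) * h * density
    return total


def psi(x):
    return moment_integral(lambda k: math.exp(-k * (1.0 - x)))


def dpsi(x):
    return moment_integral(lambda k: k * math.exp(-k * (1.0 - x)))


mean_k = dpsi(1.0)                           # <K> = Psi'(1)
mean_k2 = moment_integral(lambda k: k * k)   # <K^2>
r0 = BETA / (BETA + GAMMA) * mean_k2 / mean_k

# Fixed-point iteration for Theta(infinity), starting from Theta = 0.
theta = 0.0
for _ in range(200):
    theta = GAMMA / (BETA + GAMMA) + BETA / (BETA + GAMMA) * dpsi(theta) / mean_k
final_size = 1.0 - psi(theta)
print(f"<K>={mean_k:.3f}  R0={r0:.3f}  Theta_inf={theta:.4f}  R_inf={final_size:.4f}")
```

Here $\langle K\rangle = 8$ and $\langle K^2\rangle = 352/3$, so $\mathcal{R}_0 \approx 1.91$ for the chosen $\beta$ and $\gamma$.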



4.2 Mean Field Social Heterogeneity 

For the expected degree formulation of the Mean Field Social Heterogeneity (MFSH) model, the probability an edge exists between u and v at time $t$ is $\kappa_u\kappa_v/(N-1)\langle K\rangle$. Whether this edge exists at one moment is independent of whether it exists at any other moment and what other edges exist.

As before, we consider two nodes whose expected degrees differ by $\Delta\kappa$ and ask how much the additional $\Delta\kappa$ reduces the probability of being susceptible. The definition of $\Theta$ is slightly more problematic here because having an edge at one moment is independent of having one later. So it does not make sense to discuss the probability an edge did not transmit previously because the edge did not exist previously. Instead, we note that in the MP case $(1-\Theta)\Delta\kappa$ could be interpreted as the probability that the additional amount of $\Delta\kappa$ ever contributed an edge that had transmitted infection. Guided by this, we define $\Theta$ so that as $\Delta\kappa \to 0$, the extra amount of $\Delta\kappa$ has at some time contributed an edge that transmitted to the node with probability $(1-\Theta)\Delta\kappa$. This again leads to $\Psi(x) = \int_0^\infty e^{-\kappa(1-x)}\rho(\kappa)\,d\kappa$. The flow diagram for individuals (figure 12) is unchanged:

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R.$$

Figure 12: Mean Field Social Heterogeneity model. The flow diagram for a Mean Field Social Heterogeneity population (expected degree formulation). This is similar to the actual degree formulation in figure 4. The new variables $\Pi_S$, $\Pi_I$, and $\Pi_R$ (bottom right) represent the probability that a newly formed edge connects to a susceptible, infected, or recovered node; they can be thought of as the relative rates that each group forms edges. Since the test node does not cause infections, and the probability a contact is with a node of a given $\kappa$ is equal to the probability a new contact will be with a node of the given $\kappa$, the probability a current neighbor has a given state is equal to the probability a new neighbor has that state. Thus each $\Phi$ variable equals the corresponding $\Pi$ variable.

To find the evolution of $\Theta$, we define $\Phi_S$, $\Phi_I$, and $\Phi_R$ to be the probabilities a current edge connects to a susceptible, infected, or recovered node. The probability that a small extra amount $\Delta\kappa$ currently contributes an edge and previously had a different edge that transmitted scales like $\Delta\kappa^2(1-\Theta)$. Since $\Delta\kappa^2 \ll \Delta\kappa$, this is negligible compared to the probability that there is a current edge. We conclude that at leading order, $\Phi_S\Delta\kappa$, $\Phi_I\Delta\kappa$, and $\Phi_R\Delta\kappa$ give the probability that the $\Delta\kappa$ contributes a current edge connected to a susceptible, infected, or recovered node and there has never been a transmission due to this extra $\Delta\kappa$.

We can construct a flow diagram between $\Phi_S\Delta\kappa$, $\Phi_I\Delta\kappa$, $\Phi_R\Delta\kappa$, and $(1-\Theta)\Delta\kappa$. Because all of these have $\Delta\kappa$ in them, we factor it out and just use $\Phi_S$, $\Phi_I$, $\Phi_R$, and $1-\Theta$. Because edges have no duration, there is no $\Phi_S$ to $\Phi_I$ or $\Phi_I$ to $\Phi_R$ flux (similar to the actual degree MFSH model). Instead there is flux in and out of these compartments representing the continuing change of edges. The $\Phi_I$ to $1-\Theta$ flux is $\beta\Phi_I$.

Because edges have no duration, the probability an edge connects to an individual of a given type is the probability a new edge connects to an individual of that type: $\Phi_S = \Pi_S$, $\Phi_I = \Pi_I$, and $\Phi_R = \Pi_R$, where $\Pi_S$, $\Pi_I$, and $\Pi_R$ are the probabilities that a newly formed edge connects to a susceptible, infected, or recovered node.$^1$ We have $\Pi_S = \Psi'(\Theta)/\Psi'(1)$ and $\dot{\Pi}_R = \gamma\Pi_I$. Since $\Pi_I = \Phi_I$, this means $\dot{\Pi}_R = -\gamma\dot{\Theta}/\beta$. Integrating gives $\Pi_R = \gamma(1-\Theta)/\beta$. So $\Pi_I = 1 - \Psi'(\Theta)/\Psi'(1) - \gamma(1-\Theta)/\beta$. Since $\dot{\Theta} = -\beta\Phi_I = -\beta\Pi_I$, we finally have

$$\dot{\Theta} = -\beta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta),$$

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R.$$

Under appropriate limits the MFSH equations in $\kappa$ reduce to those in $k$ and vice versa, so the models are equivalent [38]. Surprisingly, this system differs from the MP equations only in the first term of the $\dot{\Theta}$ equation.



1 Unlike the actual degree formulation we do not need a factor of $\Theta$ in these because the smallness of $\Delta\kappa$ allows us to assume there has never been a previous transmission.

Figure 13: Expected Degree MFSH Example (section 4.2.2). Model predictions (dashed) match a simulated epidemic (solid) in a MFSH network with $15 \times 10^6$ nodes. The solid curve is a single simulation. Time is chosen so that $t = 0$ corresponds to 0.5% cumulative incidence.



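4.2.1 $\mathcal{R}_0$ and final size

We find

$$\mathcal{R}_0 = \frac{\beta}{\gamma}\,\frac{\langle K^2\rangle}{\langle K\rangle}$$

and the final size is $R(\infty) = 1 - \Psi(\Theta(\infty))$ where $\Theta(\infty)$ solves

$$\Theta = 1 - \frac{\beta}{\gamma}\left(1 - \frac{\Psi'(\Theta)}{\Psi'(1)}\right).$$

Full details are in the Appendix.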



4.2.2 Example 

For our example we take a population with $\rho(\kappa) = e^{\kappa}/(e^3 - 1)$ for $0 < \kappa < 3$ and 0 otherwise, giving

$$\Psi(x) = \frac{e^{3x} - 1}{x(e^3 - 1)}.$$

We take $\gamma = 1$ and $\beta = 0.435$ and compare simulation with theory in figure 13. We choose these parameters to demonstrate that the approach remains accurate for small $\mathcal{R}_0 = 1.04$. We use a population of $15 \times 10^6$. There is noise since the epidemic does not infect a large number of people.
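The quoted value of $\mathcal{R}_0$ can be reproduced from the moments of $\rho$. The short sketch below assumes only the density stated above; the closed forms for $\langle K\rangle$ and $\langle K^2\rangle$ come from integrating $\kappa e^{\kappa}$ and $\kappa^2 e^{\kappa}$ by parts over $(0, 3)$, and the variable names are ours.

```python
import math

beta, gamma = 0.435, 1.0
norm = math.exp(3.0) - 1.0
# Moments of rho(kappa) = e^kappa / (e^3 - 1) on (0, 3), by integration by parts.
mean_k = (2.0 * math.exp(3.0) + 1.0) / norm    # <K>   = Psi'(1)
mean_k2 = (5.0 * math.exp(3.0) - 2.0) / norm   # <K^2> = Psi''(1)
r0 = beta / gamma * mean_k2 / mean_k
print(round(r0, 2))  # prints 1.04
```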



4.3 Dynamic Variable-Degree 

The Dynamic Variable-Degree (DVD) model interpolates between the MP model and the expected degree formulation of the MFSH model. Each node is assigned $\kappa$ using $\rho(\kappa)$ and creates edges at rate $\kappa\eta$ (joining to another node also creating an edge). Existing edges break at rate $\eta$. Thus a node has expected degree $\kappa$, though its value varies around $\kappa$. In fact it is Poisson distributed over time.

We define $\Theta$ such that the probability a small $\Delta\kappa$ has ever contributed an edge that has transmitted infection is $(1-\Theta)\Delta\kappa$. We again have

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R.$$

We generalize the earlier definitions and define $\Phi_S$, $\Phi_I$, and $\Phi_R$ to be the probabilities a current edge has never transmitted infection and connects to a susceptible, infected, or recovered node.

Figure 14: Dynamic Variable-Degree model. The flow diagrams for the DVD model.

We define $\Pi_S$, $\Pi_I$, and $\Pi_R$ to be the probabilities a new edge connects to a susceptible, infected, or recovered node. We have $\Pi_S = \Psi'(\Theta)/\Psi'(1)$, $\Pi_I = 1 - \Pi_S - \Pi_R$, and $\dot{\Pi}_R = \gamma\Pi_I$. We build the flow diagram for $\Phi_S\Delta\kappa$, $\Phi_I\Delta\kappa$, $\Phi_R\Delta\kappa$, and $(1-\Theta)\Delta\kappa$. There is flux into $\Phi_S\Delta\kappa$ at rate $\eta\Pi_S\Delta\kappa$ because this is the rate that the $\Delta\kappa$ leads to edge creation. There is flux out of $\Phi_S\Delta\kappa$ at rate $\eta\Phi_S\Delta\kappa$ because existing edges break at rate $\eta$, and the probability such an edge exists is $\Phi_S\Delta\kappa$. Similar fluxes exist for $\Phi_I$ and $\Phi_R$. The flux out of $\Phi_I\Delta\kappa$ into $\Phi_R\Delta\kappa$ is $\gamma\Phi_I\Delta\kappa$ as before, and the flux into $(1-\Theta)\Delta\kappa$ is $\beta\Phi_I\Delta\kappa$. We factor out $\Delta\kappa$ and the flow diagrams (figure 14) are defined.

Because the existence of an edge from the test node u to the neighbor v has no impact on any other edges v might have, the contacts v has aside from u are indistinguishable from the contacts of another node with the same $\kappa$, and so they are susceptible with the same probability: $\Phi_S = \Pi_S$. The fluxes into and out of $\Phi_S$ from edge creation/deletion balance, and the $\Phi_S$ to $\Phi_I$ flux is simply $-\dot{\Phi}_S$. Using this and the other fluxes for $\Phi_I$, we conclude $\dot{\Phi}_I = -\dot{\Phi}_S + \eta\Pi_I - (\eta+\gamma+\beta)\Phi_I$. Since $\dot{\Pi}_R = \gamma\Pi_I$ and $\dot{\Theta} = -\beta\Phi_I$, we can integrate this and find $\Phi_I = -\Psi'(\Theta)/\Psi'(1) + \eta\Pi_R/\gamma + (\beta+\eta+\gamma)\Theta/\beta - (\eta+\gamma)/\beta$. So $\dot{\Theta} = -\beta\Phi_I$ can be written in terms of $\Theta$ and $\Pi_R$. We arrive at



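\begin{align}
\dot{\Theta} &= -\beta\Theta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta) + \eta\left(1 - \Theta - \frac{\beta}{\gamma}\Pi_R\right), \tag{25}\\
\dot{\Pi}_R &= \gamma\Pi_I, \quad \Pi_S = \Psi'(\Theta)/\Psi'(1), \quad \Pi_I = 1 - \Pi_S - \Pi_R, \tag{26}\\
\dot{R} &= \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R. \tag{27}
\end{align}

This is simpler than the DFD case because the smallness of $\Delta\kappa$ allowed us to assume that no previous transmission occurred.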
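Equations (25)-(27) can be integrated in the same way as the MP system. The following is a minimal sketch under stated assumptions rather than the code behind figure 15: the function name `integrate_dvd`, the fixed-step Runge-Kutta scheme, the initial state $\Theta(0) = 1 - 10^{-3}$, $\Pi_R(0) = R(0) = 0$, and the exponential density $\rho(\kappa) = e^{-\kappa/m}/m$ with $\Psi(x) = 1/(1+m(1-x))$ are all illustrative choices.

```python
def integrate_dvd(beta, gamma, eta, psi, dpsi, theta0=1 - 1e-3, t_end=60.0, dt=0.01):
    """Fixed-step RK4 integration of the DVD equations (25)-(27).

    The state is (Theta, Pi_R, R). Returns the final (Theta, Pi_R, S, I, R).
    """
    dpsi1 = dpsi(1.0)

    def rhs(state):
        theta, pi_r, r = state
        pi_s = dpsi(theta) / dpsi1          # Pi_S = Psi'(Theta)/Psi'(1)
        pi_i = 1.0 - pi_s - pi_r            # Pi_I = 1 - Pi_S - Pi_R
        dtheta = (-beta * theta + beta * pi_s + gamma * (1.0 - theta)
                  + eta * (1.0 - theta - beta * pi_r / gamma))
        return (dtheta, gamma * pi_i, gamma * (1.0 - psi(theta) - r))

    def shifted(state, slope, h):
        return tuple(s + h * k for s, k in zip(state, slope))

    state = (theta0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * dt))
        k3 = rhs(shifted(state, k2, 0.5 * dt))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    theta, pi_r, r = state
    s = psi(theta)
    return theta, pi_r, s, 1.0 - s - r, r


# Assumed example: rho(kappa) = exp(-kappa/m)/m, so Psi(x) = 1/(1 + m(1 - x)).
m = 4.0
psi = lambda x: 1.0 / (1.0 + m * (1.0 - x))
dpsi = lambda x: m / (1.0 + m * (1.0 - x)) ** 2
theta, pi_r, s, i, r = integrate_dvd(beta=0.5, gamma=1.0, eta=0.5, psi=psi, dpsi=dpsi)
print(f"Theta={theta:.4f}  Pi_R={pi_r:.4f}  S={s:.4f}  I={i:.2e}  R={r:.4f}")
```

Setting `eta=0.0` removes the last term of equation (25), so the same function reproduces the MP dynamics; this gives a simple consistency check between the two models.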



4.3.1 $\mathcal{R}_0$ and final size

We find

$$\mathcal{R}_0 = \frac{\beta}{\beta+\eta+\gamma}\,\frac{\eta+\gamma}{\gamma}\,\frac{\langle K^2\rangle}{\langle K\rangle}.$$

The total proportion infected is $R(\infty) = 1 - \Psi(\Theta(\infty))$ where

$$\Theta(\infty) = \frac{\beta}{\beta+\eta+\gamma}\left(\frac{\eta+\gamma}{\gamma}\,\frac{\Psi'(\Theta(\infty))}{\Psi'(1)} + \frac{\eta+\gamma}{\beta} - \frac{\eta}{\gamma}\right).$$

Full details are in the Appendix.

Figure 15: Dynamic Variable-Degree Example (section 4.3.2). Model predictions (dashed) match the average of 92 simulated epidemics (solid) in a population of $10^4$ nodes. For each simulation time is chosen so that $t = 0$ corresponds to 3% cumulative incidence. Then they are averaged to give the solid curve.



4.3.2 Example 

We choose the same distribution of $\kappa$ as of $k$ in the DFD example, NB(4, 1/3). Because $\kappa$ then takes integer values, we find

$$\Psi(x) = \psi(e^{x-1}),$$

where $\psi$ is the probability generating function of the DFD example. We take the same parameters $\beta = 5/4$, $\gamma = 1$, and $\eta = 1/2$. Figure 15 shows that the equations accurately predict the spread. As in the DFD and DC cases, we use an average as the comparison point, taking 240 simulations and averaging the 92 resulting in epidemics.

The final size is larger than for the DFD model. Although the average numbers of contacts are all the 
same, the increase in transmission routes when an individual had more contacts than expected outweighs 
the decrease when the number was less than expected. The net effect is to increase the final size. 



5 Discussion 

We have introduced a new approach to study the spread of infectious diseases. This edge-based compartmental modeling approach allows us to simultaneously consider the impacts of contact duration and social heterogeneity. It is conceptually simple and leads to equations of comparable simplicity to the mass action model. It produces a broad family of models which contains several known models as special cases. It further allows us to investigate the effect of many behaviors which have previously been inaccessible to analytic study.

A significant contribution of this work is that it allows us to study the spread of a disease in a population for which some individuals have different propensity to form contacts while allowing us to explicitly
incorporate the impact of contact duration. The interaction of contact duration and numbers of overlapping 
contacts plays a significant role in the spread of many diseases, and in particular may play an important role 
in the spread of HIV. These techniques open the door to studying these questions analytically rather than 
relying on simulation. 

The edge-based compartmental modeling approach has a simple, graphical interpretation through flow diagrams. This simplifies model description, and guides generalizations. Because the derivations are straightforward, we need only track a small number of compartments, and we do not require any closure approximations, we propose that this is the "correct" perspective to study the deterministic dynamics of SIR epidemics on random networks. Other existing techniques rely on approximations [14] or produce complicated or large systems of equations [53, 27, 5]. This approach does not offer immediate insight into stochastic effects where methods such as branching processes are more appropriate. In later papers, we use this approach to derive other generalizations, including different correlations in population structure, different disease dynamics, and populations that change structure in response to the spreading disease.

Our approach is limited by the assumption that infection of one neighbor of u can be treated as independent of that of another neighbor. This is a strong assumption, and prevents us from applying this model
to SIS diseases for which individuals return to a susceptible state. In this case, the assumption that u does 
not infect its neighbors can alter the future state of u. In the real population, if u becomes infected, it 
can infect neighbors who then infect u when u returns to a susceptible state. Thus if contact duration is 
nonzero, our predictions may be significantly altered. This limitation is often not recognized but may lead 
to important failures of mean-field or mass action models when applied to a population for which contact 
duration is important [7]. 

Treating neighbors as independent also means that if v and w are neighbors of u, we assume no alternate short path between v and w exists. In particular, we assume that clustering [57, 43] is negligible: neighbors are unlikely to see one another. This assumption applies equally to most existing analytic epidemic models, but it can be eliminated in special cases using techniques similar to those of [55, 33, 44, 21].

A Appendix

In this appendix, we give additional information for the edge-based compartmental modeling approach for the spread of susceptible-infected-recovered (SIR) diseases in different types of static and dynamic networks. We give a more detailed discussion of the use of a test node and the assumption that test nodes do not cause infections. We then discuss the calculation of $\mathcal{R}_0$, the behavior of our equations at early time (showing that the thresholds they predict are the same as those given by $\mathcal{R}_0$), and for some cases we give the final size prediction. Finally, we show that the MFSH models we have used are in fact equivalent to some more familiar existing models.

B Selection of the test node 

The basis of our approach is the claim that the probability a randomly selected test node u is susceptible, 
infected, or recovered is equal to the proportion of the population that is susceptible, infected, or recovered. 
This claim implicitly assumes that the epidemic size grows deterministically: if stochastic effects could cause 
the outbreak to die out or even be slightly delayed, this claim is false. The probability a random node u is 
infected by time $t$ depends on whether an epidemic happens and, if so, how delayed it is. So as in any ODE approach, our model is exact only once the outbreak is large enough to behave deterministically.

Our assumptions that the susceptible proportion of the population equals the probability u is susceptible, the proportion infected equals the probability u is infected, and the proportion recovered equals the probability u is recovered allow us to move our focus away from the proportion in each state. Instead we focus on the probability that u is in each state. Our goal remains to determine the course of the epidemic in the entire population, but our method will be to focus on the equivalent problem of finding the probability the randomly chosen test node u has a given status.

In order to calculate the probability that u is susceptible, infected, or recovered, we find another equivalent 
problem which is mathematically simpler. We make a simplifying assumption which allows us to treat 
neighbors of u as independent. As it stands, if w infects u, then u can infect another neighbor v, meaning the statuses of v and w are not independent. We ignore this dependency, that is, we ignore transmissions from
u to its neighbors. To make this mathematically precise, we prevent u from transmitting infection to its 
neighbors. This has no impact until after u is infected, so it has no impact on the probability u is susceptible. 
It may affect the state of neighbors of u once u is infected, but it has no impact on the duration of infection 
of u, and so it does not alter the probability that u is infected or recovered. Consequently, this alteration 
of u has no impact on the probability that u is in any given state. Thus our result for S, I, and R is not 
affected by preventing u from causing infection. 

Consequently, we can calculate S, I, and R as the probability that u is susceptible, infected, or recovered 
under the assumption that u is prevented from causing infection. The result will give the proportion of the 
population that is susceptible, infected or recovered in the original epidemic. 

C Simulation 

Both static networks and networks with mean field social heterogeneity satisfy the "time homogeneity" assumption of [23]. That is, given the properties of u and v, the a priori probability that u would transmit infection to v if u is infected is independent of the time at which u becomes infected. Consequently, for these cases we can use the Epidemic Percolation Network (EPN) approach of [22]. In this, we consider each node u in turn. We assume that u becomes infected and select the duration of infection from the appropriate exponential distribution of mean $1/\gamma$. Given the duration of infection, for every node v that u might infect, we calculate the probability that u infects v, and randomly determine whether u infects v, and if so, how long it takes. We then create a directed network by assigning edges from u to each node it would infect with the edge weighted by the associated duration. This directed network is an EPN.

To simulate an epidemic, we can choose a node to be the index case. We then follow the epidemic as it 
passes from each node to the nodes that it would infect. If the outbreak remains small, we discard it. This 
can be done efficiently using Dijkstra's algorithm [13]. To quickly identify a node which sparks an epidemic,
we can take the EPN and find the strongly-connected components within it. Above the epidemic threshold 
there is a single giant strongly-connected component. Any node from its "in-component" (including any 
node within the giant strongly-connected component) would spark an epidemic. We choose any of these 
nodes randomly and use it as the index case. 
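The EPN construction and the Dijkstra step can be sketched for a static network as follows. This is an illustration under assumptions, not the simulation code used for the figures: the function names `build_epn` and `infection_times`, the adjacency-list input, and the toy ring network are ours. Instead of first computing the transmission probability and then the delay, the sketch draws an exponential transmission delay with rate $\beta$ and keeps the edge only if the delay is shorter than the infectious period, which determines both at once.

```python
import heapq
import random


def build_epn(adjacency, beta, gamma, rng):
    """Build an Epidemic Percolation Network for a static network.

    adjacency maps each node to a list of neighbours. The result maps each
    node u to a list of (v, delay) pairs: u would infect v `delay` time units
    after u itself becomes infected.
    """
    epn = {}
    for u, neighbours in adjacency.items():
        duration = rng.expovariate(gamma)      # infectious period, mean 1/gamma
        out_edges = []
        for v in neighbours:
            delay = rng.expovariate(beta)      # time until u would transmit to v
            if delay < duration:               # transmission precedes recovery
                out_edges.append((v, delay))
        epn[u] = out_edges
    return epn


def infection_times(epn, index_case):
    """Dijkstra's algorithm on the EPN: earliest infection time of each node reached."""
    times = {}
    queue = [(0.0, index_case)]
    while queue:
        t, u = heapq.heappop(queue)
        if u in times:
            continue                           # u was already infected earlier
        times[u] = t
        for v, delay in epn[u]:
            if v not in times:
                heapq.heappush(queue, (t + delay, v))
    return times


# Toy usage on a ring of 20 nodes (an assumed network, purely for illustration).
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
epn = build_epn(ring, beta=2.0, gamma=1.0, rng=random.Random(1))
times = infection_times(epn, index_case=0)
print(len(times), "nodes infected")
```

Because the EPN is built once and then reused, many index cases can be tried on the same realization; only `infection_times` needs to be rerun.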

The DVD, DFD, and DC models are harder to frame in terms of the EPN framework, so we use more 
traditional simulation techniques. We use a Gillespie-style event-driven algorithm [TH] and calculate whether 
the next event is a transmission, recovery, edge creation or edge breaking. For the DFD model, edges break 
in pairs and neighbors are swapped. These calculations are considerably slower because there are many 
events to track, only a few of which are directly relevant to disease transmission. 

D $\mathcal{R}_0$, early growth, and final size

In this section, we briefly turn away from the deterministic ODE methods and use branching process arguments to calculate $\mathcal{R}_0$ for each population. We then return to the ODEs and linearize the equations about the equilibrium corresponding to a fully susceptible population. We calculate the early growth rate, show that it is consistent with the branching process $\mathcal{R}_0$ above, and identify appropriate initial conditions. Finally, for most of the models, we are able to calculate a final size relation.

The typically quoted definition of $\mathcal{R}_0$ is the number of new cases caused by a single randomly infected individual in a completely susceptible population. However, a more careful definition is necessary in cases where the average individual in the population may have different properties than the average infected individual early in the epidemic. The appropriate definition of $\mathcal{R}_0$ is the number of new cases an average
infected individual causes early in an outbreak [111 1511 15^1 152j . In particular, for an epidemic on a network, 
a single node chosen randomly in the population and then infected will have (on average) $\langle K\rangle$ neighbors to infect, while early on the typical infected node has higher degree than a randomly chosen node and has at least one neighbor which is no longer susceptible. Early in an outbreak, the probability mass function for a newly infected node in the actual degree case to have degree $k$ is $P_n(k) = kP(k)/\langle K\rangle$, while in the expected degree case the probability density function for a newly infected node to have expected degree $\kappa$ is $\rho_n(\kappa) = \kappa\rho(\kappa)/\langle K\rangle$. Consequently, we must account for the fact that such a node has higher degree than average, but we must also account for the fact that such a node cannot infect the source of its infection.

In our calculation of the early growth, we (as expected) find that if $\mathcal{R}_0 < 1$, the disease has negative growth rate. We assume that the early growth is proportional to the leading eigenvector and use this to find appropriate initial conditions. In practice this is unnecessary because effectively any appropriate initial condition (with almost all individuals and stubs being in a susceptible state) quickly converges to the leading eigenvector. For our calculation of the final size, we are often able to identify a unique equilibrium corresponding to the state of the population after the disease has spread through. For some models this is not possible. As expected, if $\mathcal{R}_0 < 1$, we find that the only equilibrium corresponds to no large scale transmission, but if $\mathcal{R}_0 > 1$ there is an additional equilibrium which we can calculate to find the final size.

Most of our calculations for the early growth and final size are done under the assumption that an epidemic occurs and in the limit that the initial proportion infected goes to zero. Thus our results are inappropriate for $\mathcal{R}_0 < 1$. In calculating $\phi_S$ and $\phi_R$ in terms of $\theta$, we found that they take particular forms. However, the imposed initial conditions could be different. In the growing epidemic case, these early perturbations become insignificant as the number of infections becomes much larger than the initial conditions. However, in the case of a decaying epidemic, the initial number of infections is always significant compared to the later number. So the variations never disappear. Thus if the initial conditions do not satisfy the formulae we derived, the later solution does not either. This can still be handled using the edge-based compartmental modeling approach. To correct for this in the CM model (and similar models) we would need to find the equations for $\dot{\phi}_S$ and $\dot{\phi}_I$ (resulting in a system more like the DFD equations).



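D.1 Actual degree models
D.1.1 CM

$\mathcal{R}_0$ In a CM network the expected number of infections a newly infected node causes is $\mathcal{R}_0 = \sum_k P_n(k)(k-1)\beta/(\beta+\gamma)$ where $\beta/(\beta+\gamma)$ is the probability a node infects a neighbor prior to recovering. The reason for the $-1$ is that a newly infected node has one neighbor (its infector) who is not susceptible, and so there are $k - 1$ susceptible neighbors. So

\begin{align*}
\mathcal{R}_0 &= \sum_k P_n(k)(k-1)\frac{\beta}{\beta+\gamma}\\
&= \sum_k \frac{k(k-1)P(k)}{\langle K\rangle}\,\frac{\beta}{\beta+\gamma}\\
&= \frac{\beta}{\beta+\gamma}\,\frac{\langle K^2 - K\rangle}{\langle K\rangle}\\
&= \frac{\beta}{\beta+\gamma}\,\frac{\psi''(1)}{\psi'(1)}
\end{align*}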

which is a well-known result for static CM networks. This calculation is in agreement with previous results 
for CM networks [32l|42l|2l |3l |5l] . 
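The formula $\mathcal{R}_0 = \frac{\beta}{\beta+\gamma}\langle K^2 - K\rangle/\langle K\rangle$ is easy to evaluate for any degree distribution. The sketch below is illustrative: the function names `cm_r0` and `poisson_pmf`, the truncation of the Poisson distribution at degree 80, and the parameter values are our assumptions. It compares a Poisson distribution of mean 3 with a network in which every node has degree exactly 3, to show how the result depends on more than the mean degree.

```python
import math


def cm_r0(degree_pmf, beta, gamma):
    """R0 = beta/(beta+gamma) * <K^2 - K>/<K> for a degree distribution {k: P(k)}."""
    mean_k = sum(k * p for k, p in degree_pmf.items())
    excess = sum(k * (k - 1) * p for k, p in degree_pmf.items())
    return beta / (beta + gamma) * excess / mean_k


def poisson_pmf(mean, k_max=80):
    """Poisson probabilities truncated at k_max (the neglected tail is negligible)."""
    pmf, p = {}, math.exp(-mean)
    for k in range(k_max + 1):
        pmf[k] = p
        p *= mean / (k + 1)
    return pmf


beta, gamma, mean = 2.0, 1.0, 3.0
r0_poisson = cm_r0(poisson_pmf(mean), beta, gamma)
print(r0_poisson, beta * mean / (beta + gamma))  # the two values should agree closely

# Same mean degree but no variation: every node has degree 3.
r0_regular = cm_r0({3: 1.0}, beta, gamma)
print(r0_regular)
```

For the regular network $\langle K^2 - K\rangle/\langle K\rangle = 2$ rather than 3, so $\mathcal{R}_0$ drops from 2 to $4/3$ even though the mean degree is unchanged.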

In the special case of a network with a Poisson degree distribution, the probability of selecting a higher degree node and the reduction by one in the available number of susceptible neighbors exactly balance. So for the Poisson distribution $\langle K^2 - K\rangle = \langle K\rangle^2$ and $\mathcal{R}_0 = \beta\langle K\rangle/(\beta+\gamma)$. However, this does not hold for more general distributions.

Early Growth and Initial Conditions We return to the deterministic equations

$$\dot{\theta} = -\beta\theta + \beta\frac{\psi'(\theta)}{\psi'(1)} + \gamma(1-\theta),$$

$$\dot{R} = \gamma I, \quad S = \psi(\theta), \quad I = 1 - S - R.$$

Clearly $\theta = 1$ is an equilibrium solution corresponding to no transmission. To test its stability, we linearize about $\theta = 1$. We set $\theta = 1 + \epsilon$. At leading order we find

$$\dot{\epsilon} = -\beta\epsilon + \beta\frac{\psi''(1)}{\psi'(1)}\epsilon - \gamma\epsilon.$$

So at early times $\epsilon = Ce^{\lambda t}$ where

$$\lambda = -(\beta+\gamma) + \beta\frac{\psi''(1)}{\psi'(1)}.$$

The equilibrium loses stability as $\lambda$ transitions from negative to positive, that is, as $\beta\psi''(1)/[\psi'(1)(\beta+\gamma)]$ passes through 1, which is exactly the condition for $\mathcal{R}_0$ to transition from below 1 to above 1. Both methods predict the same threshold.

To find appropriate initial conditions for $S$, $I$, and $R$, we could simply take $S = \psi(\theta)$, and choose any nonnegative $I$ and $R$ such that $1 = S + I + R$. As we solve forward, any error in $I$ and $R$ decays exponentially quickly. If we wish to be more precise, we note that $\dot{I} = -\dot{S} - \gamma I$, and $\dot{S} = \dot{\theta}\psi'(\theta) = \lambda Ce^{\lambda t}\psi'(1)$ to leading order. We will have $I = Ke^{\lambda t}$, and we need to find $K$ in terms of $C$. We get $\lambda Ke^{\lambda t} = -\lambda C\psi'(1)e^{\lambda t} - \gamma Ke^{\lambda t}$. Solving this gives $K = -C\lambda\psi'(1)/(\gamma+\lambda)$, so the appropriate initial condition is

$$\theta(0) = 1 + C, \quad S(0) = \psi(\theta(0)), \quad I(0) = -\frac{C\lambda\psi'(1)}{\gamma+\lambda}, \quad R(0) = 1 - I(0) - S(0)$$

where $C$ is a small, negative number.

However, in practice, there is no need to do this. / and R have no role to play in determining 9. We 
simply require that I + R = 1 — ip{9) initially. Although our initial distribution of probability to / and R 
may differ from the true amount, it is a small effect initially and decays exponentially. So in practice we can 
use any convenient assumption. 

Final Size To calculate the final size, we note that as the epidemic dies out, the derivatives must all go to zero. Thus we can set $\dot{\theta} = 0$ and solve for $\theta(\infty)$. Note that (if $\mathcal{R}_0 > 1$) this has two solutions, as there are two equilibrium conditions. In one equilibrium the disease has not been introduced and $\theta = 1$, while in the other the disease has spread and died out and $\theta < 1$. We want the smaller of the solutions, which corresponds to an epidemic occurring. We solve

$$\theta(\infty) = \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}\frac{\psi'(\theta(\infty))}{\psi'(1)}$$

for the smaller solution. In practice, this can be done by using a guess $\theta_1 = 0$, and then plugging $\theta_i$ into the right hand side to find $\theta_{i+1}$. This iteration converges quickly, and if $\mathcal{R}_0 > 1$, the attracting solution is the solution we want. The total fraction of the population infected in the course of an epidemic is $R(\infty) = 1 - \psi(\theta(\infty))$.
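The iteration just described can be sketched in a few lines. This is a minimal example under stated assumptions: the callables `psi` and `dpsi` (the probability generating function and its derivative) are supplied by the user, and the Poisson distribution with mean 4 used below is purely illustrative.

```python
import math

def cm_final_size(beta, gamma, psi, dpsi, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for theta(infinity) in the CM model.

    Starts from the guess theta = 0 and repeatedly evaluates
    gamma/(beta+gamma) + beta/(beta+gamma) * psi'(theta)/psi'(1).
    Returns (theta_inf, R_inf) with R_inf = 1 - psi(theta_inf).
    """
    theta = 0.0
    dpsi_1 = dpsi(1.0)
    for _ in range(max_iter):
        new_theta = gamma / (beta + gamma) + beta / (beta + gamma) * dpsi(theta) / dpsi_1
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, 1.0 - psi(theta)

# Illustrative (assumed) degree distribution: Poisson with mean m.
m = 4.0
psi = lambda x: math.exp(m * (x - 1.0))
dpsi = lambda x: m * math.exp(m * (x - 1.0))
theta_inf, R_inf = cm_final_size(1.0, 1.0, psi, dpsi)
```

Because the right hand side is increasing in $\theta$, starting from 0 makes the iterates climb monotonically toward the smaller fixed point, so the iteration cannot jump past it to the disease-free solution $\theta = 1$.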

D.1.2 Actual Degree MFSH 

$\mathcal{R}_0$ To find $\mathcal{R}_0$ for the actual degree formulation of the MFSH model, we consider a newly infected node early in the epidemic. The probability it has degree $k$ is $P_n(k)$. Because it has a new set of neighbors at each moment, we do not have to account for the fact that it cannot infect the source of its infection, nor do we have to account for the fact that once it infects a neighbor, it cannot infect the neighbor again. Thus at all times it has $k$ susceptible neighbors, so it causes new infections at rate $\beta k$ for the entire time it is infected. The average duration of infection is $1/\gamma$, so the expected number of infections caused given $k$ is $\beta k/\gamma$. Averaging over $k$, we have

$$\begin{aligned}
\mathcal{R}_0 &= \sum_k P_n(k)\frac{\beta k}{\gamma} \\
&= \frac{\beta}{\gamma}\sum_k \frac{k^2 P(k)}{\langle K\rangle} \\
&= \frac{\beta}{\gamma}\frac{\langle K^2\rangle}{\langle K\rangle} \\
&= \frac{\beta}{\gamma}\left(\frac{\psi''(1)}{\psi'(1)} + 1\right)
\end{aligned}$$

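Early Growth and Initial Conditions We begin with the equations

$$\dot{\theta} = -\beta\theta + \beta\theta^2\frac{\psi'(\theta)}{\psi'(1)} - \theta\gamma\ln\theta$$

$$\dot{R} = \gamma I, \quad S = \psi(\theta), \quad I = 1 - S - R$$

and proceed similarly to the CM case. We set $\theta = 1 + \epsilon$ and at leading order we find

$$\dot{\epsilon} = -\beta\epsilon + 2\beta\epsilon + \beta\frac{\psi''(1)}{\psi'(1)}\epsilon - \gamma\epsilon .$$

After some rearrangement, we have $\dot{\epsilon} = [\beta - \gamma + \beta\psi''(1)/\psi'(1)]\epsilon$. So $\epsilon = Ce^{\lambda t}$ where

$$\lambda = \beta - \gamma + \beta\frac{\psi''(1)}{\psi'(1)} .$$

The equilibrium loses stability exactly where $\mathcal{R}_0 = 1$. The remaining calculations are identical to those of the CM case, and we find that the appropriate initial conditions are as before except that the value of $\lambda$ is different:

$$\theta(0) = 1 + C, \quad S(0) = \psi(\theta(0)), \quad I(0) = -\frac{C\lambda\psi'(1)}{\gamma+\lambda}, \quad R(0) = 1 - I(0) - S(0)$$

(recall $C$ is a small, negative number). As before, any reasonable initial condition with $\theta$ close to 1, $S = \psi(\theta)$ and $I + R = 1 - S$ would be acceptable.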

Final Size To find the final size of an epidemic, we set $\dot{\theta}$ to zero and solve for $\theta$. We find

$$\theta(\infty) = \exp\left[-\frac{\beta}{\gamma}\left(1 - \frac{\theta(\infty)\psi'(\theta(\infty))}{\psi'(1)}\right)\right]$$

If $\mathcal{R}_0 > 1$ this has two solutions, one with $\theta = 1$, and one with $0 < \theta < 1$, which is the solution of interest. Once this is found, the total fraction infected is $R(\infty) = 1 - \psi(\theta(\infty))$.

Note that if $\psi(x) = x^k$ for some $k$, this corresponds to the MFSH model with all individuals having the same contact rate, which is the MA model, and $\mathcal{R}_0 = k\beta/\gamma$. We find

$$\theta = \exp\left(-\frac{\beta}{\gamma}\left[1 - \theta^k\right]\right)$$

Rewriting the left hand side as $\theta = S^{1/k} = (1 - R(\infty))^{1/k}$ and raising both sides to the $k$th power, we have

$$1 - R(\infty) = \exp\left[-\frac{k\beta}{\gamma}R(\infty)\right]$$

which is the well known final size relation for the MA model

$$R(\infty) = 1 - \exp\left(-\mathcal{R}_0 R(\infty)\right)$$
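The reduction to the MA final size relation can be checked numerically. The sketch below assumes $\psi(x) = x^k$ and uses hypothetical parameter values; it solves the MFSH final size equation by iteration from $\theta = 0$ and then evaluates the residual of the MA relation.

```python
import math

def mfsh_theta_inf(beta, gamma, k, tol=1e-13, max_iter=100000):
    """Solve theta = exp(-(beta/gamma) * (1 - theta**k)) by fixed-point
    iteration starting from theta = 0 (the case psi(x) = x**k)."""
    theta = 0.0
    for _ in range(max_iter):
        new_theta = math.exp(-(beta / gamma) * (1.0 - theta ** k))
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

# Hypothetical values giving R_0 = k * beta / gamma = 2.
beta, gamma, k = 0.5, 1.0, 4
theta = mfsh_theta_inf(beta, gamma, k)
R_inf = 1.0 - theta ** k            # R(inf) = 1 - psi(theta(inf))
R_0 = k * beta / gamma
# Residual of the MA final size relation R = 1 - exp(-R_0 R).
residual = R_inf - (1.0 - math.exp(-R_0 * R_inf))
```

A residual at the level of rounding error indicates that the final size obtained from the MFSH equation also satisfies the MA relation.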

D.1.3 DFD 

$\mathcal{R}_0$ To calculate $\mathcal{R}_0$ for this model, consider a randomly chosen newly infected node early in the epidemic. It has degree $k$ with probability $P_n(k)$. Initially this node has $k - 1$ available susceptible neighbors. Let us focus instead on the one edge from the infection source. The stub may result in more infections if the edge is broken and reformed. The probability that it breaks and reforms prior to recovering is $\eta/(\eta+\gamma)$. The probability that it then causes infection prior to recovering is $\beta/(\beta+\gamma)$. At this point the stub is connected to an infected neighbor, the same state it was in at the beginning of infection, and the process repeats. So the probability this stub infects at least $n$ nodes is $r^n$ where $r = \eta\beta/[(\beta+\gamma)(\eta+\gamma)]$. Summing this gives an expectation of $r/(1-r)$ new infections for this stub. Now consider one of the $k - 1$ stubs that are not the source of infection. The probability that this stub transmits infection at least once is $\beta/(\beta+\gamma)$. After this it is like the stub that received infection. Thus the expected number of infections such a stub causes is $[\beta/(\beta+\gamma)][1 + r/(1-r)]$ which can be rearranged into $[(\eta+\gamma)/\eta][r/(1-r)]$.

Adding these together, we see that a newly infected node is expected to cause

$$\begin{aligned}
\mathcal{R}_0 &= \sum_k P_n(k)\left[\frac{r}{1-r} + (k-1)\frac{\eta+\gamma}{\eta}\frac{r}{1-r}\right] \\
&= \frac{r}{1-r}\sum_k \frac{kP(k)}{\langle K\rangle}\left[1 + (k-1)\frac{\eta+\gamma}{\eta}\right] \\
&= \frac{r}{1-r}\left[1 + \frac{\eta+\gamma}{\eta}\frac{\langle K^2 - K\rangle}{\langle K\rangle}\right] \\
&= \frac{\beta}{\beta+\eta+\gamma}\left(\frac{\eta}{\gamma} + \frac{\eta+\gamma}{\gamma}\frac{\langle K^2 - K\rangle}{\langle K\rangle}\right) \\
&= \frac{\beta}{\beta+\eta+\gamma}\left(\frac{\eta}{\gamma} + \frac{\eta+\gamma}{\gamma}\frac{\psi''(1)}{\psi'(1)}\right)
\end{aligned}$$

where we have substituted for $r = \eta\beta/[(\beta+\gamma)(\eta+\gamma)]$.
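The stub-counting argument can be checked against the closed form. The sketch below is only an illustration: the function names, the truncation of the geometric series at `n_terms`, and the small degree distribution are all assumptions made for the example.

```python
def dfd_r0_direct(beta, gamma, eta, degree_dist, n_terms=200):
    """Average the per-stub expectations over newly infected nodes.

    degree_dist maps degree k to P(k). The geometric series for the
    stub that delivered the infection is summed explicitly.
    """
    r = eta * beta / ((beta + gamma) * (eta + gamma))
    source_stub = sum(r ** n for n in range(1, n_terms + 1))   # approximates r / (1 - r)
    other_stub = beta / (beta + gamma) * (1.0 + source_stub)
    mean_k = sum(k * p for k, p in degree_dist.items())
    total = 0.0
    for k, p in degree_dist.items():
        p_n = k * p / mean_k                # P_n(k) = k P(k) / <K>
        total += p_n * (source_stub + (k - 1) * other_stub)
    return total

def dfd_r0_closed(beta, gamma, eta, degree_dist):
    """Closed form in terms of psi''(1)/psi'(1) = <K^2 - K>/<K>."""
    mean_k = sum(k * p for k, p in degree_dist.items())
    ratio = sum(k * (k - 1) * p for k, p in degree_dist.items()) / mean_k
    return beta / (beta + eta + gamma) * (eta / gamma + (eta + gamma) / gamma * ratio)

# Hypothetical example values.
dist = {1: 0.2, 3: 0.5, 6: 0.3}
direct = dfd_r0_direct(1.0, 1.0, 0.5, dist)
closed = dfd_r0_closed(1.0, 1.0, 0.5, dist)
```

The two values agree to rounding error, which gives a quick sanity check on the algebra leading from the sum over $k$ to the final expression.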
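Early Growth and Initial Conditions Our equations are

$$\dot{\theta} = -\beta\phi_I$$

$$\dot{\phi}_S = -\beta\phi_I\phi_S\frac{\psi''(\theta)}{\psi'(\theta)} + \eta\theta\pi_S - \eta\phi_S$$

$$\dot{\phi}_I = \beta\phi_I\phi_S\frac{\psi''(\theta)}{\psi'(\theta)} + \eta\theta\pi_I - (\beta+\gamma+\eta)\phi_I$$

$$\dot{\pi}_R = \gamma\pi_I, \quad \pi_S = \frac{\theta\psi'(\theta)}{\psi'(1)}, \quad \pi_I = 1 - \pi_R - \pi_S$$

$$\dot{R} = \gamma I, \quad S(t) = \psi(\theta), \quad I(t) = 1 - S - R .$$

Here we have a higher dimensional problem, and the equilibrium of interest is $\theta = 1$, $\phi_S = 1$, $\phi_I = 0$, $\pi_S = 1$, and $\pi_I = 0$. We set $\theta = 1 + \epsilon_1$, $\phi_S = 1 + \epsilon_2$, and $\phi_I = \epsilon_3$. For $\pi_S$ we use $\pi_S = \theta\psi'(\theta)/\psi'(1)$. For $\pi_I$, we set $\pi_I = \epsilon_4$ and use the fact that $\dot{\pi}_I = -\dot{\pi}_S - \gamma\pi_I$. We linearize about the equilibrium. We find

$$\begin{aligned}
\dot{\epsilon}_1 &= -\beta\epsilon_3 \\
\dot{\epsilon}_2 &= -\beta\frac{\psi''(1)}{\psi'(1)}\epsilon_3 + \eta\frac{2\psi'(1)+\psi''(1)}{\psi'(1)}\epsilon_1 - \eta\epsilon_2 \\
\dot{\epsilon}_3 &= \beta\frac{\psi''(1)}{\psi'(1)}\epsilon_3 + \eta\epsilon_4 - (\beta+\gamma+\eta)\epsilon_3 \\
\dot{\epsilon}_4 &= \beta\frac{\psi'(1)+\psi''(1)}{\psi'(1)}\epsilon_3 - \gamma\epsilon_4
\end{aligned}$$

which can be rewritten as the matrix equation

$$\frac{d}{dt}\begin{pmatrix}\epsilon_1\\ \epsilon_2\\ \epsilon_3\\ \epsilon_4\end{pmatrix} =
\begin{pmatrix}
0 & 0 & -\beta & 0 \\
\eta\frac{2\psi'(1)+\psi''(1)}{\psi'(1)} & -\eta & -\beta\frac{\psi''(1)}{\psi'(1)} & 0 \\
0 & 0 & \beta\frac{\psi''(1)}{\psi'(1)} - (\beta+\gamma+\eta) & \eta \\
0 & 0 & \beta\frac{\psi'(1)+\psi''(1)}{\psi'(1)} & -\gamma
\end{pmatrix}
\begin{pmatrix}\epsilon_1\\ \epsilon_2\\ \epsilon_3\\ \epsilon_4\end{pmatrix}$$
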
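The standard solution technique for this is to find the largest eigenvalue of the matrix. It is relatively straightforward to show that $0$ and $-\eta$ are always eigenvalues of this matrix. The other two turn out to be the eigenvalues of the $2\times 2$ matrix forming the lower right corner. They are

$$\lambda_{1,2} = \frac{-\left(2\gamma+\beta+\eta-\beta\frac{\psi''(1)}{\psi'(1)}\right) \pm \sqrt{\left(2\gamma+\beta+\eta-\beta\frac{\psi''(1)}{\psi'(1)}\right)^2 - 4\left(\gamma(\beta+\gamma+\eta) - \gamma\beta\frac{\psi''(1)}{\psi'(1)} - \eta\beta\left(1+\frac{\psi''(1)}{\psi'(1)}\right)\right)}}{2}$$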

For our above expression, if $\mathcal{R}_0 = 1$, then $\psi''(1)/\psi'(1) = (\gamma-\eta)/(\gamma+\eta) + \gamma/\beta$. Placing this into our expression above, the largest eigenvalue becomes 0. If $\psi''(1)/\psi'(1)$ is larger than this threshold ($\mathcal{R}_0 > 1$) then an epidemic can occur. If it is less than this threshold, this eigenvalue goes below zero, but there is still an eigenvalue of zero whose eigenvector has $\epsilon_3$ and $\epsilon_4$ both zero (corresponding to $\phi_I$ and $\pi_I$ both zero). If $\mathcal{R}_0$ is less than 1, the values of $\epsilon_3$ and $\epsilon_4$ will decay according to the largest eigenvalue whose eigenvector has nonzero entries in the appropriate component.
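The threshold statement can be verified numerically. The sketch below assumes that the lower right $2\times 2$ block of the linearized DFD system has rows $(\beta x - (\beta+\gamma+\eta),\ \eta)$ and $(\beta(1+x),\ -\gamma)$ with $x = \psi''(1)/\psi'(1)$; this follows from linearizing the equations for $\phi_I$ and $\pi_I$, and the function names and parameter values are hypothetical.

```python
import math

def dfd_dominant_eigenvalue(beta, gamma, eta, x):
    """Larger eigenvalue of the assumed 2x2 block governing (phi_I, pi_I).

    x stands for psi''(1)/psi'(1).
    """
    a = beta * x - (beta + gamma + eta)
    b = eta
    c = beta * (1.0 + x)
    d = -gamma
    trace = a + d
    det = a * d - b * c
    # (a - d)^2 + 4 b c >= 0 because b, c >= 0; clip only against rounding.
    disc = max(trace * trace - 4.0 * det, 0.0)
    return 0.5 * (trace + math.sqrt(disc))

def dfd_threshold_ratio(beta, gamma, eta):
    """Value of psi''(1)/psi'(1) at which R_0 = 1."""
    return (gamma - eta) / (gamma + eta) + gamma / beta

beta, gamma, eta = 1.0, 1.0, 0.5
x_star = dfd_threshold_ratio(beta, gamma, eta)
lam_at_threshold = dfd_dominant_eigenvalue(beta, gamma, eta, x_star)
lam_above = dfd_dominant_eigenvalue(beta, gamma, eta, x_star + 0.5)
lam_below = dfd_dominant_eigenvalue(beta, gamma, eta, x_star - 0.3)
```

At the threshold the dominant eigenvalue vanishes, and it changes sign as $\psi''(1)/\psi'(1)$ moves across the threshold, matching the claim in the text.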

For initial conditions, if $\mathcal{R}_0 > 1$, we take $\lambda$ to be the largest eigenvalue, and the vector $(a,b,c,d)$ to be the corresponding eigenvector. Then the appropriate initial conditions are found by taking $\theta = 1 + \epsilon a$, $\phi_S = 1 + \epsilon b$, $\phi_I = \epsilon c$, and $\pi_I = \epsilon d$ where $\epsilon \ll 1$. Note that we must choose our eigenvector such that $a < 0$. In practice however, so long as the initial amount of infection is taken to be very small, the initial conditions need not take this exact form; the solution will quickly converge to something of this form. If $\mathcal{R}_0 < 1$, then the appropriate initial conditions come from a linear combination of the eigenvector of 0 and the decaying eigenvector. The coefficient of the 0-eigenvector will be very small unless the initial introduction infected many individuals.

We note that in the previous cases, if $\mathcal{R}_0 < 1$, then the perturbation to $\theta$ decays, and $\theta$ returns to 1. Physically, this is an unrealistic result: it says that if transmission has already happened, then as time progresses, transmission is undone. Here we do not see that. If transmission has happened, then it does not decay, corresponding to the eigenvector of 0. The reason that this model is more correct is that in the previous models, we found a relation between $\phi_I$ and $\theta$. This relation implicitly assumes that the epidemic is growing. If it is not growing, then this relation does not hold. Eliminating the assumption from those models would result in additional equations for $\phi_S$ and $\phi_I$, and the system would look more like the DFD model.



Final Size We have not been able to find a simple expression for the final size of an epidemic in this case. The system has multiple equilibria corresponding to possible states after the disease was introduced. In the previous cases, we were able to find a closed form for the relation between $\theta$ and $\phi_I$. The assumption made there was equivalent to stating that the early growth is dominated by the eigenvector of the largest eigenvalue. In the previous cases, this assumption led to an analytic relation between $\phi_I$ and $\theta$. In our case, we still want to make the equivalent assumption, which gives a constraint that determines which equilibrium is the final state. However, we have not found a way to impose the constraint analytically. Instead we must solve the system using initial conditions corresponding to a small number of cases to find the correct final size. Thus we only have the final size as a numerical prediction.
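A numerical final size of this kind can be obtained with any standard ODE integrator. The following is a minimal sketch, not the authors' implementation: it assumes the DFD equations in the form $\dot\theta = -\beta\phi_I$, $\dot\phi_S = -\beta\phi_I\phi_S\psi''(\theta)/\psi'(\theta) + \eta\theta\pi_S - \eta\phi_S$, $\dot\phi_I = \beta\phi_I\phi_S\psi''(\theta)/\psi'(\theta) + \eta\theta\pi_I - (\beta+\gamma+\eta)\phi_I$, $\dot\pi_R = \gamma\pi_I$, a Poisson degree distribution with mean `m`, a hand-written fourth-order Runge-Kutta step, and a small seed `eps` placed in $\phi_I$. All names and parameter values are illustrative.

```python
import math

def dfd_final_size(beta, gamma, eta, m, eps=1e-4, dt=0.02, t_max=200.0):
    """Integrate the (assumed) DFD system and return 1 - psi(theta(t_max))."""
    psi = lambda x: math.exp(m * (x - 1.0))
    dpsi = lambda x: m * math.exp(m * (x - 1.0))
    ratio = m  # psi''(x) / psi'(x) is constant for a Poisson distribution

    def rhs(y):
        theta, phi_s, phi_i, pi_r, r = y
        pi_s = theta * dpsi(theta) / dpsi(1.0)
        pi_i = 1.0 - pi_r - pi_s
        new_inf = beta * phi_i * phi_s * ratio
        i = 1.0 - psi(theta) - r
        return (
            -beta * phi_i,
            -new_inf + eta * theta * pi_s - eta * phi_s,
            new_inf + eta * theta * pi_i - (beta + gamma + eta) * phi_i,
            gamma * pi_i,
            gamma * i,
        )

    def shifted(y, k, h):
        return tuple(a + h * b for a, b in zip(y, k))

    # State: (theta, phi_S, phi_I, pi_R, R), seeded with a little infection.
    y = (1.0, 1.0 - eps, eps, 0.0, 0.0)
    for _ in range(int(round(t_max / dt))):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, 0.5 * dt))
        k3 = rhs(shifted(y, k2, 0.5 * dt))
        k4 = rhs(shifted(y, k3, dt))
        y = tuple(a + dt / 6.0 * (p + 2.0 * q + 2.0 * s + u)
                  for a, p, q, s, u in zip(y, k1, k2, k3, k4))
    return 1.0 - psi(y[0])

size_static = dfd_final_size(1.0, 1.0, 0.0, 4.0)   # eta = 0: no edge swapping
size_swapping = dfd_final_size(1.0, 1.0, 0.5, 4.0)
```

With $\eta = 0$ the system reduces to the static CM case, so the first value can be compared with the CM final size as a check on the integrator before trusting results for $\eta > 0$.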



D.1.4 DC 



To calculate $\mathcal{R}_0$ for the DC model, we first define $r_a$, $r_d$, and $r_s$ to be the expected number of infections caused by a stub prior to recovery given that, at the time of infection, the stub is active and connected to a node other than the source, dormant, or active but connected to the source of infection, respectively.

It is straightforward to show that if the stub is active and connected to a node other than the source, the probability that the edge transmits prior to breaking or recovery is $\beta/(\beta+\gamma+\eta_2)$. The probability it breaks prior to recovery is $\eta_2/(\gamma+\eta_2)$. Once it breaks, it is equivalent to a stub that was dormant at infection. Thus $r_a = \beta/(\beta+\gamma+\eta_2) + \eta_2 r_d/(\gamma+\eta_2)$.

To find $r_d$, we note that a dormant stub must find a neighbor prior to recovery before it can cause any transmissions. Once this happens, it is equivalent to a stub that was active at infection. Thus $r_d = \eta_1 r_a/(\gamma+\eta_1)$. Combining this with our expression for $r_a$, we have

$$r_a = \frac{\beta(\eta_1+\gamma)(\eta_2+\gamma)}{\gamma(\gamma+\eta_1+\eta_2)(\beta+\gamma+\eta_2)} .$$

To find $r_s$, we note that infection cannot happen along that stub until the stub breaks and reforms, at which point it is equivalent to an active stub, so $r_s = \eta_1\eta_2 r_a/[(\gamma+\eta_1)(\gamma+\eta_2)]$.

The probability that a stub is active is $\xi = \eta_1/(\eta_1+\eta_2)$ and the probability it is dormant is $\pi = 1 - \xi$. The total number of infections a node with degree $k_m$ is expected to cause is $r_s + (k_m-1)\xi r_a + (k_m-1)(1-\xi)r_d$.

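Since the probability a newly infected node has degree $k_m$ is $P_n(k_m) = k_m P(k_m)/\langle K_m\rangle$, we find

$$\begin{aligned}
\mathcal{R}_0 &= \sum_{k_m} P_n(k_m)\left[(k_m-1)\xi r_a + (k_m-1)(1-\xi)r_d + r_s\right] \\
&= \sum_{k_m} \frac{k_m P(k_m)}{\langle K_m\rangle}\left[(k_m-1)\xi r_a + (k_m-1)(1-\xi)r_d + r_s\right] \\
&= \sum_{k_m} \frac{k_m P(k_m)}{\langle K_m\rangle}\left[(k_m-1)\left(\xi + (1-\xi)\frac{\eta_1}{\gamma+\eta_1}\right) + \frac{\eta_1\eta_2}{(\gamma+\eta_1)(\gamma+\eta_2)}\right] r_a \\
&= \left[\frac{\langle K_m^2 - K_m\rangle}{\langle K_m\rangle}\frac{\eta_1}{\eta_1+\eta_2}\frac{\gamma+\eta_1+\eta_2}{\gamma+\eta_1} + \frac{\eta_1\eta_2}{(\gamma+\eta_1)(\gamma+\eta_2)}\right] r_a \\
&= \frac{\beta}{\gamma}\left[\frac{\langle K_m^2 - K_m\rangle}{\langle K_m\rangle}\frac{\eta_1}{\eta_1+\eta_2}\frac{\eta_2+\gamma}{\beta+\gamma+\eta_2} + \frac{\eta_1\eta_2}{(\gamma+\eta_1+\eta_2)(\beta+\gamma+\eta_2)}\right] \\
&= \frac{\beta}{\beta+\eta_2+\gamma}\left[\frac{\langle K_m^2 - K_m\rangle}{\langle K_m\rangle}\frac{\eta_1}{\eta_1+\eta_2}\frac{\eta_2+\gamma}{\gamma} + \frac{\eta_1\eta_2}{\gamma(\gamma+\eta_1+\eta_2)}\right] \\
&= \frac{\beta}{\beta+\eta_2+\gamma}\left[\frac{\psi''(1)}{\psi'(1)}\frac{\eta_1}{\eta_1+\eta_2}\frac{\eta_2+\gamma}{\gamma} + \frac{\eta_1\eta_2}{\gamma(\gamma+\eta_1+\eta_2)}\right]
\end{aligned}$$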
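The coupled definitions of $r_a$, $r_d$ and $r_s$ lend themselves to a numerical cross-check. The sketch below solves the two-equation recursion by repeated substitution and compares the resulting $\mathcal{R}_0$ with the closed form; the function names, the iteration count and the example degree distribution are assumptions for illustration.

```python
def dc_stub_expectations(beta, gamma, eta1, eta2, n_iter=200):
    """Solve r_a = beta/(beta+gamma+eta2) + eta2/(gamma+eta2) * r_d and
    r_d = eta1/(gamma+eta1) * r_a by repeated substitution."""
    r_a = r_d = 0.0
    for _ in range(n_iter):
        r_a = beta / (beta + gamma + eta2) + eta2 / (gamma + eta2) * r_d
        r_d = eta1 / (gamma + eta1) * r_a
    r_s = eta1 * eta2 * r_a / ((gamma + eta1) * (gamma + eta2))
    return r_a, r_d, r_s

def dc_r0_direct(beta, gamma, eta1, eta2, degree_dist):
    """Average over newly infected nodes; degree_dist maps k_m to P(k_m)."""
    r_a, r_d, r_s = dc_stub_expectations(beta, gamma, eta1, eta2)
    xi = eta1 / (eta1 + eta2)                  # probability a stub is active
    mean_k = sum(k * p for k, p in degree_dist.items())
    return sum(k * p / mean_k * ((k - 1) * xi * r_a + (k - 1) * (1 - xi) * r_d + r_s)
               for k, p in degree_dist.items())

def dc_r0_closed(beta, gamma, eta1, eta2, degree_dist):
    """Closed form with psi''(1)/psi'(1) = <K_m^2 - K_m>/<K_m>."""
    mean_k = sum(k * p for k, p in degree_dist.items())
    ratio = sum(k * (k - 1) * p for k, p in degree_dist.items()) / mean_k
    return beta / (beta + eta2 + gamma) * (
        ratio * eta1 / (eta1 + eta2) * (eta2 + gamma) / gamma
        + eta1 * eta2 / (gamma * (gamma + eta1 + eta2)))

# Hypothetical example values.
params = (1.0, 1.0, 0.5, 0.25)
dist = {1: 0.2, 3: 0.5, 6: 0.3}
r_a, r_d, r_s = dc_stub_expectations(*params)
r0_direct = dc_r0_direct(*params, dist)
r0_closed = dc_r0_closed(*params, dist)
```

The repeated substitution converges because each pass multiplies the error by $\eta_1\eta_2/[(\gamma+\eta_1)(\gamma+\eta_2)] < 1$.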



Early Growth and Initial Conditions We have not attempted to calculate the early growth rate because showing the details will not be particularly informative. The method is similar to that for the DFD model. If we wish to use appropriate initial conditions, we simply begin with $\theta$ approximately 1, $\phi_S$ approximately $\pi$, $\phi_D$ approximately $\theta - \phi_S$, $\pi_S$ approximately $\pi$, and $\xi_S$ approximately $\xi$. We can make all the $R$ variables 0, and then set the $I$ variables to $I = 1 - S$, $\phi_I = \theta - \phi_S - \phi_D$, $\pi_I = \pi - \pi_S$, and $\xi_I = \xi - \xi_S$. This will converge relatively quickly to the appropriate eigenvector. Alternately, we could solve the linear system and identify the appropriate eigenvalue and use it to find the initial conditions.

Final Size As in the DFD case, we need an additional constraint to identify the appropriate equilibrium. 
We do not have this constraint analytically, so we must solve the ODE system numerically to find the final 
size. 

D.2 Expected Degree Models 
D.2.1 MP 

$\mathcal{R}_0$ We calculate $\mathcal{R}_0$ much as in the CM network. We focus on all individuals with a given expected degree $\kappa$: these nodes have a Poisson degree distribution, and the fact that those with higher degree are more likely to become infected exactly cancels the reduction in available contacts, and so the expected number of remaining contacts of a newly infected node with expected degree $\kappa$ is $\kappa$. So the expected number of infections such a node causes is $\kappa\beta/(\gamma+\beta)$. To find $\mathcal{R}_0$, we must take a weighted average over the value of $\kappa$ for newly infected individuals.

The probability a newly infected individual has expected degree $\kappa$ is $\rho_n(\kappa)$. So we find

$$\mathcal{R}_0 = \int_0^\infty \rho_n(\kappa)\frac{\kappa\beta}{\beta+\gamma}\,d\kappa = \frac{\langle K^2\rangle}{\langle K\rangle}\frac{\beta}{\beta+\gamma} = \frac{\beta}{\beta+\gamma}\frac{\Psi''(1)}{\Psi'(1)}$$

where $\langle K^2\rangle$ denotes the average of $\kappa^2$. It turns out that $\langle K^2\rangle$ equals the average of $k^2 - k$ taken over the actual degrees $k$, so this result is the same as the CM result.



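Early Growth and Initial Conditions To calculate the early growth, we take

$$\dot{\Theta} = -\beta\Theta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta)$$

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R$$

and set $\Theta = 1 + \epsilon$. At leading order we have

$$\dot{\epsilon} = -\beta\epsilon + \beta\frac{\Psi''(1)}{\Psi'(1)}\epsilon - \gamma\epsilon .$$

We find $\epsilon = Ce^{\lambda t}$ where

$$\lambda = -(\beta+\gamma) + \beta\frac{\Psi''(1)}{\Psi'(1)} .$$

Proceeding as in the CM case, the initial conditions are

$$\Theta(0) = 1 + C, \quad S(0) = \Psi(\Theta(0)), \quad I(0) = -\frac{C\lambda\Psi'(1)}{\gamma+\lambda}, \quad R(0) = 1 - I(0) - S(0)$$

Final Size The final size of epidemics in MP networks can be calculated in much the same way as for CM networks. We set $\dot{\Theta} = 0$ and find

$$\Theta(\infty) = \frac{\gamma}{\beta+\gamma} + \frac{\beta}{\beta+\gamma}\frac{\Psi'(\Theta(\infty))}{\Psi'(1)}$$

Then $S(\infty) = \Psi(\Theta(\infty))$ and $R(\infty) = 1 - S(\infty)$.

D.2.2 Expected Degree MFSH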

$\mathcal{R}_0$ To find $\mathcal{R}_0$ for the expected degree formulation of the MFSH model, we consider a newly infected node early in the epidemic. The probability density function for the expected degree $\kappa$ is $\rho_n(\kappa)$. Because it has a new set of neighbors at each moment, we do not have to account for the fact that it cannot infect the source of its infection, nor do we have to account for the fact that once it infects a neighbor, it cannot infect the neighbor again. Thus on average it has $\kappa$ susceptible neighbors, so it causes new infections at average rate $\beta\kappa$ for the entire time it is infected. The average duration of infection is $1/\gamma$, so the expected number of infections caused given $\kappa$ is $\beta\kappa/\gamma$. Taking the average over all $\kappa$, we have

$$\begin{aligned}
\mathcal{R}_0 &= \int_0^\infty \rho_n(\kappa)\frac{\beta\kappa}{\gamma}\,d\kappa \\
&= \frac{\beta}{\gamma}\int_0^\infty \frac{\kappa^2\rho(\kappa)}{\langle K\rangle}\,d\kappa \\
&= \frac{\beta}{\gamma}\frac{\langle K^2\rangle}{\langle K\rangle} \\
&= \frac{\beta}{\gamma}\frac{\Psi''(1)}{\Psi'(1)}
\end{aligned}$$

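Early Growth and Initial Conditions Our governing equations are

$$\dot{\Theta} = -\beta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta)$$

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R$$

Setting $\Theta = 1 + \epsilon$, we have

$$\dot{\epsilon} = \beta\frac{\Psi''(1)}{\Psi'(1)}\epsilon - \gamma\epsilon .$$

So $\epsilon = Ce^{\lambda t}$ where

$$\lambda = \beta\frac{\Psi''(1)}{\Psi'(1)} - \gamma .$$

We see that the threshold for $\lambda = 0$ is again the same as $\mathcal{R}_0 = 1$.

To find the initial conditions, we repeat our previous approach and find

$$\Theta(0) = 1 + C, \quad S(0) = \Psi(\Theta(0)), \quad I(0) = -\frac{C\lambda\Psi'(1)}{\gamma+\lambda}, \quad R(0) = 1 - I(0) - S(0)$$

where $C$ is a small, negative number.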

Final Size To find the final size we set $\dot{\Theta} = 0$ and find that $\Theta(\infty)$ solves

$$\Theta(\infty) = 1 - \frac{\beta}{\gamma}\left(1 - \frac{\Psi'(\Theta(\infty))}{\Psi'(1)}\right)$$

Then we have $R(\infty) = 1 - \Psi(\Theta(\infty))$.

D.2.3 DVD



To calculate $\mathcal{R}_0$ for the DVD population, we begin by considering a newly infected node soon after disease is introduced. Because nodes are infected with probability proportional to their expected degree, the probability density function for a node to have expected degree $\kappa$ given that it is newly infected is $\rho_n(\kappa) = \kappa\rho(\kappa)/\langle K\rangle$. Given a newly infected node with expected degree $\kappa$, the expected number of additional neighbors (other than its infector) it has is also $\kappa$ (as in the static MP case). For each of those neighbors, the probability that it transmits infection prior to recovering or breaking the edge is $\beta/(\beta+\eta+\gamma)$. So the expected number of transmissions to neighbors it has when the infection occurs is $\kappa\beta/(\beta+\eta+\gamma)$.

However, the node also has the opportunity to infect neighbors that it gains during its infectious period. The probability that it creates a new edge before recovering is given by considering the recovery rate $\gamma$ and the edge creation rate $\kappa\eta$. We track edge creations before recovery. The probability that at least one edge creation occurs is $\kappa\eta/(\gamma+\kappa\eta)$. More generally, the probability that at least $n$ edge creations occur is $[\kappa\eta/(\gamma+\kappa\eta)]^n$. If it gains at least $n$ neighbors, the probability that it infects the $n$-th neighbor before recovering or breaking the edge is $\beta/(\beta+\eta+\gamma)$. So the probability that a node creates an $n$-th neighbor and infects that neighbor is $[\beta/(\beta+\eta+\gamma)][\kappa\eta/(\gamma+\kappa\eta)]^n$.

The expected number of newly created neighbors which it infects can be found by summing the probability that a node creates and infects an $n$-th neighbor over all $n$. This gives $[\beta/(\beta+\eta+\gamma)]\sum_{n=1}^{\infty}[\kappa\eta/(\gamma+\kappa\eta)]^n = [\beta/(\beta+\eta+\gamma)][\kappa\eta/\gamma]$. Adding the expected number of new and original neighbors infected together, the expected number of infections a node with $\kappa$ causes is $[\beta/(\beta+\eta+\gamma)]\kappa[1+\eta/\gamma]$. Taking a weighted average over all $\kappa$ gives

$$\mathcal{R}_0 = \frac{\beta}{\beta+\eta+\gamma}\frac{\langle K^2\rangle}{\langle K\rangle}\frac{\eta+\gamma}{\gamma} = \frac{\beta}{\beta+\eta+\gamma}\frac{\Psi''(1)}{\Psi'(1)}\frac{\eta+\gamma}{\gamma}$$

The terms in this expression may be interpreted as follows: $\langle K^2\rangle/\langle K\rangle$ gives the expected value of $\kappa$ for a newly infected node, $\beta/(\beta+\eta+\gamma)$ gives the probability that an edge which exists at any point during the infectious period will transmit infection prior to breaking or the infectious period ending, and $(\eta+\gamma)/\gamma = 1 + \eta/\gamma$ gives the expected number of susceptible contacts per expected degree to exist at infection (1) or be created prior to recovery ($\eta/\gamma$).

Early growth We take the equations

$$\dot{\Theta} = -\beta\Theta + \beta\frac{\Psi'(\Theta)}{\Psi'(1)} + \gamma(1-\Theta) + \eta\left(1 - \Theta - \frac{\beta}{\gamma}\Pi_R\right)$$

$$\dot{\Pi}_R = \gamma\Pi_I, \quad \Pi_S = \frac{\Psi'(\Theta)}{\Psi'(1)}, \quad \Pi_I = 1 - \Pi_S - \Pi_R$$

$$\dot{R} = \gamma I, \quad S = \Psi(\Theta), \quad I = 1 - S - R .$$

We set $\Theta = 1 + \epsilon_1$ and $\Pi_R = \epsilon_2$. We note that $\dot{\Pi}_R = \gamma\Pi_I = \gamma(1 - \Pi_S - \Pi_R)$. At leading order we have

$$\begin{aligned}
\dot{\epsilon}_1 &= \left[\beta\frac{\Psi''(1)}{\Psi'(1)} - (\beta+\gamma+\eta)\right]\epsilon_1 - \frac{\eta\beta}{\gamma}\epsilon_2 \\
\dot{\epsilon}_2 &= -\gamma\frac{\Psi''(1)}{\Psi'(1)}\epsilon_1 - \gamma\epsilon_2
\end{aligned}$$

which becomes

$$\frac{d}{dt}\begin{pmatrix}\epsilon_1\\ \epsilon_2\end{pmatrix} =
\begin{pmatrix}
\beta\frac{\Psi''(1)}{\Psi'(1)} - (\beta+\gamma+\eta) & -\frac{\eta\beta}{\gamma} \\
-\gamma\frac{\Psi''(1)}{\Psi'(1)} & -\gamma
\end{pmatrix}
\begin{pmatrix}\epsilon_1\\ \epsilon_2\end{pmatrix}$$

The eigenvalues of a $2\times 2$ matrix solve $\lambda^2 - T\lambda + D = 0$ where $T$ is the trace and $D$ the determinant. So the dominant eigenvalue is

$$\lambda = \frac{T + \sqrt{T^2 - 4D}}{2}$$

If $T > 0$, then the growth rate is positive. To show that $T > 0$ implies $\mathcal{R}_0 > 1$, note that $T > 0$ implies $\beta\Psi''(1)/\Psi'(1) > \beta + \gamma + \eta$. From this the product of the factors $\beta/(\beta+\eta+\gamma)$ and $\Psi''(1)/\Psi'(1)$ in our expression for $\mathcal{R}_0$ is greater than 1. Because $(\eta+\gamma)/\gamma > 1$, it follows that $\mathcal{R}_0 > 1$. If $T < 0$, then our equations predict growth if and only if $D < 0$. To complete our argument that the equations predict growth exactly when $\mathcal{R}_0 > 1$, we must show that if $T < 0$, then $\mathcal{R}_0 > 1$ is equivalent to $D < 0$. We can show that

$$D = -\beta(\eta+\gamma)\frac{\Psi''(1)}{\Psi'(1)} + \gamma(\beta+\gamma+\eta)$$

From this a small amount of algebra shows $D < 0$ is equivalent to $\mathcal{R}_0 > 1$. Thus regardless of the sign of $T$, $\lambda > 0$ exactly when $\mathcal{R}_0 > 1$, and conversely $\lambda < 0$ exactly when $\mathcal{R}_0 < 1$. So the predicted thresholds are the same.
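The equivalence of the two thresholds can be spot-checked over a grid of parameters. In the sketch below the determinant is the expression for $D$ given above, while the trace $T = \beta\Psi''(1)/\Psi'(1) - (\beta+\gamma+\eta) - \gamma$ is an assumption obtained by linearizing about $\Theta = 1$, $\Pi_R = 0$; the function names and the grid values are hypothetical.

```python
import itertools
import math

def dvd_r0(beta, gamma, eta, x):
    """R_0 for the DVD model, with x standing for Psi''(1)/Psi'(1)."""
    return beta / (beta + eta + gamma) * x * (eta + gamma) / gamma

def dvd_growth_rate(beta, gamma, eta, x):
    """Dominant eigenvalue (T + sqrt(T^2 - 4D)) / 2 of the linearized system."""
    trace = beta * x - (beta + gamma + eta) - gamma      # assumed form of T
    det = -beta * (eta + gamma) * x + gamma * (beta + gamma + eta)
    disc = max(trace * trace - 4.0 * det, 0.0)           # clip rounding error only
    return 0.5 * (trace + math.sqrt(disc))

def thresholds_agree(betas, gammas, etas, xs):
    """True if sign(lambda) matches sign(R_0 - 1) at every grid point."""
    for beta, gamma, eta, x in itertools.product(betas, gammas, etas, xs):
        r0 = dvd_r0(beta, gamma, eta, x)
        if abs(r0 - 1.0) < 1e-9:
            continue  # skip points sitting exactly on the threshold
        lam = dvd_growth_rate(beta, gamma, eta, x)
        if (lam > 0.0) != (r0 > 1.0):
            return False
    return True

agree = thresholds_agree(
    betas=(0.3, 1.0, 2.5), gammas=(0.7, 1.3),
    etas=(0.1, 1.0, 4.0), xs=(0.5, 1.5, 3.0, 7.0))
```

A grid check is not a proof, but it catches sign or algebra slips in either expression quickly.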

To find the appropriate initial conditions, we can again take any sufficiently small reasonable initial condition and the particulars of the initial condition will be unimportant. Alternately, we can note that the solution for $(\epsilon_1, \epsilon_2)$ must converge to $Ce^{\lambda t}\mathbf{v}$ where $\mathbf{v}$ is the eigenvector of the eigenvalue $\lambda$. This takes the value

$$\mathbf{v} = \begin{pmatrix}\lambda + \gamma \\ -\gamma\frac{\Psi''(1)}{\Psi'(1)}\end{pmatrix}$$

From this it is straightforward to find the appropriate initial conditions using the approaches seen before.



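Final Size At the end of the epidemic, no infected nodes remain, and so $I(\infty) = \Phi_I(\infty) = \Pi_I(\infty) = 0$. We have $\Pi_R(\infty) = 1 - \Pi_S(\infty) = 1 - \Psi'(\Theta(\infty))/\Psi'(1)$. Setting $\dot{\Theta} = 0$ we find

$$\Theta(\infty) = \frac{1}{\beta+\eta+\gamma}\left(\beta\frac{\eta+\gamma}{\gamma}\frac{\Psi'(\Theta(\infty))}{\Psi'(1)} + \eta + \gamma - \beta\frac{\eta}{\gamma}\right)$$

We can solve this for $\Theta(\infty)$ using iterative methods. The total fraction infected is

$$R(\infty) = 1 - S(\infty) = 1 - \Psi(\Theta(\infty))$$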



E Equivalence of MFSH models with pre-existing models 

The basic equations for the MFSH model used by other authors QJ HH1 [301 SOI HE] are 

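$$\dot{S}_k = -\beta k S_k\zeta$$

$$\dot{I}_k = \beta k S_k\zeta - \gamma I_k$$

$$\zeta = \frac{\sum_k kP(k)I_k}{\langle K\rangle}$$

However, in the actual degree case we have derived

$$\dot{\theta} = -\beta\theta + \beta\theta^2\frac{\psi'(\theta)}{\psi'(1)} - \theta\gamma\ln\theta \tag{28}$$

$$\dot{R} = \gamma I, \quad S = \psi(\theta), \quad I = 1 - S - R \tag{29}$$

It is not immediately obvious that these are equivalent. To see that they are, we first reduce the dimensions of the first system. We note that the equation for $S_k$ has as solution

$$S_k = e^{-\beta k\int_{-\infty}^{t}\zeta(t')\,dt'}$$

We set

$$\alpha = e^{-\beta\int_{-\infty}^{t}\zeta(t')\,dt'}$$

and then $S_k = \alpha^k$. Our goal is to show that in fact, $\alpha$ solves the same equation as $\theta$. We begin by noting that

$$\dot{\alpha} = -\beta\zeta\alpha$$

So $\zeta = -\dot{\alpha}/\beta\alpha$.

We now move to finding $\dot{\zeta}$.

$$\dot{\zeta} = \frac{\sum_k kP(k)\dot{I}_k}{\langle K\rangle} = \beta\zeta\frac{\sum_k k^2P(k)\alpha^k}{\langle K\rangle} - \gamma\zeta$$

We substitute $\zeta = -\dot{\alpha}/\beta\alpha$ to express this as a derivative.

$$\dot{\zeta} = -\dot{\alpha}\frac{\sum_k k^2P(k)\alpha^{k-1}}{\psi'(1)} + \frac{\gamma}{\beta}\frac{\dot{\alpha}}{\alpha} = \frac{d}{dt}\left[-\frac{\alpha\psi'(\alpha)}{\psi'(1)} + \frac{\gamma}{\beta}\ln\alpha\right]$$

We can integrate this to find

$$\zeta = 1 - \frac{\alpha\psi'(\alpha)}{\psi'(1)} + \frac{\gamma}{\beta}\ln\alpha$$

(using the fact that $\zeta \to 0$ and $\alpha \to 1$ at early time) and so $\dot{\alpha} = -\beta\alpha\zeta$ becomes

$$\dot{\alpha} = -\beta\alpha + \beta\alpha^2\frac{\psi'(\alpha)}{\psi'(1)} - \alpha\gamma\ln\alpha$$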



which means that $\alpha$ solves the same equation as $\theta$ for the fixed degree version of the MFSH equations. Since $S_k = \alpha^k$ is the same formula as we would find for $S_k$ in terms of $\theta$, this shows that in fact the two systems of equations are equivalent.

We are not the first to see that the usual system can be simplified into a handful of equations, but the approach we have used to derive these equations is new. Previous authors have simply observed that the $S_k$ equation can be solved, done so, and then used a change of variables. The resulting equations are equivalent to our own, but are written in terms of slightly different variables. The advantage of our system is that the variables connect more easily to meaningful quantities, so it can be derived directly, and it can be related to the other edge-based compartmental models.

The usual model can be altered to allow for continuous contact rates, which would yield

$$\dot{S}_\kappa = -\beta\kappa S_\kappa\zeta$$

$$\dot{I}_\kappa = \beta\kappa S_\kappa\zeta - \gamma I_\kappa$$

$$\zeta = \frac{\int_0^\infty \kappa\rho(\kappa)I_\kappa\,d\kappa}{\langle K\rangle}$$

A similar approach shows that this is equivalent to our expected degree formulation of the MFSH equations.